Draft
 B. Fitzpatrick
 B. Slatkin
 Google, Inc
 M. Atkins
 Six Apart Ltd.
 February 8, 2010


PubSubHubbub Core 0.3 -- Working Draft

Abstract

An open, simple, web-scale pubsub protocol, along with an open source reference implementation targeting Google App Engine. Notably, however, nothing in the protocol is centralized, or Google- or App Engine-specific. Anybody can play.

As opposed to more developed (and more complex) pubsub specs like Jabber Publish-Subscribe [XEP-0060], this spec's base profile (the barrier to entry to speak it) is dead simple. The fancy bits required for high-volume publishers and subscribers are optional. The base profile is HTTP-based, as opposed to XMPP (see more on this below).

To dramatically simplify this spec in several places where we had to choose between supporting A or B, we took it upon ourselves to say "only A", rather than making it an implementation decision.

We offer this spec in hopes that it fills a need or at least advances the state of the discussion in the pubsub space. Polling sucks. We think a decentralized pubsub layer is a fundamental, missing layer in the Internet architecture today and its existence, more than just enabling the obvious lower latency feed readers, would enable many cool applications, most of which we can't even imagine. But we're looking forward to decentralized social networking.



Table of Contents

1.  Notation and Conventions
2.  Definitions
3.  High-level protocol flow
4.  Atom Details
5.  Discovery
6.  Subscribing and Unsubscribing
    6.1.  Subscriber Sends Subscription Request
    6.2.  Hub Verifies Intent of the Subscriber
    6.3.  Automatic Subscription Refreshing
7.  Publishing
    7.1.  New Content Notification
    7.2.  Content Fetch
    7.3.  Content Distribution
    7.4.  Authenticated Content Distribution
    7.5.  Aggregated Content Distribution
8.  Best Practices
    8.1.  For Hubs
    8.2.  For Subscribers
9.  References
Appendix A.  Specification Feedback
§  Authors' Addresses





1.  Notation and Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. Domain name examples use [RFC2606].




2.  Definitions

Topic:
An Atom [RFC4287] or RSS [RSS20] feed URL [RFC3986]. The unit to which one can subscribe to changes. This spec currently only addresses feed URLs that require no additional authorization headers.
Hub ("the hub"):
The server (URL [RFC3986]) which implements both sides of this protocol. We have implemented this and are running a server at http://pubsubhubbub.appspot.com that is, at least for now, open to anybody for use, as either a publisher or subscriber. Any hub MAY implement its own policies on who can use it.
Publisher:
An owner of a topic. Notifies the hub when the topic feed has been updated. It just notifies that it has been updated, but not how. As in almost all pubsub systems, the publisher is unaware of the subscribers, if any. Other pubsub systems might call the publisher the "source".
Subscriber:
An entity (person or program) that wants to be notified of changes on a topic. The subscriber must be directly network-accessible and is identified by its Subscriber Callback URL.
Subscription:
A unique relation to a topic by a subscriber that indicates it should receive updates for that topic. A subscription's unique key is the tuple (Topic URL, Subscriber Callback URL). Subscriptions may (at the hub's decision) have expiration times akin to DHCP leases which must be periodically renewed.
Subscriber Callback URL:
The URL [RFC3986] at which a subscriber wishes to receive notifications.
Event:
An event that causes updates to multiple topics. For each event that happens (e.g. "Brad posted to the Linux Community."), multiple topics could be affected (e.g. "Brad posted." and "Linux community has new post"). Publisher events cause topics to be updated and the hub looks up all subscriptions for affected topics, sending out notifications to subscribers.
Notification:
A payload describing how a topic's contents have changed. This difference (or "delta") is computed by the hub and sent to all subscribers. The format of the notification will be an Atom or RSS feed served by the publisher with only those entries present which are new or have changed. The notification can be the result of a publisher telling the hub of an update, or the hub proactively polling a topic feed, perhaps for a subscriber subscribing to a topic that's not pubsub-aware. Note also that a notification to a subscriber can be a payload consisting of updates for multiple topics. Hubs MAY choose to send multi-topic notifications as an optimization for heavy subscribers, but subscribers MUST understand them. See Section 7.3 (Content Distribution) for format details.



3.  High-level protocol flow

(This section is non-normative.)




4.  Atom Details

Notification and source formats will be Atom [RFC4287] or RSS [RSS20]. The publisher decides whether to include the full body, a truncated body, or only the metadata of the most recent event(s).

The trade-off between including all content in outgoing notifications or having the thundering herd (by clients who fetch the //atom:feed/entry/link in response to a notification) is up to the publisher. Entries of the most recent events MAY be provided as context, so that recipients can tell whether or not they missed any recent items (like TCP SACK). Some examples of Atom feed entries follow.

<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- Normally here would be source, title, author, id, etc ... -->

  <link rel="hub" href="http://myhub.example.com/endpoint" />
  <link rel="self" href="http://publisher.example.com/happycats.xml" />
  <updated>2008-08-11T02:15:01Z</updated>

  <!-- Example of a full entry. -->
  <entry>
    <title>Heathcliff</title>
    <link href="http://publisher.example.com/happycat25.xml" />
    <id>http://publisher.example.com/happycat25.xml</id>
    <updated>2008-08-11T02:15:01Z</updated>
    <content>
      What a happy cat. Full content goes here.
    </content>
  </entry>

  <!-- Example of an entry that isn't full/is truncated. This is implied
       by the lack of a <content> element and a <summary> element instead. -->
  <entry>
    <title>Heathcliff</title>
    <link href="http://publisher.example.com/happycat25.xml" />
    <id>http://publisher.example.com/happycat25.xml</id>
    <updated>2008-08-11T02:15:01Z</updated>
    <summary>
      What a happy cat!
    </summary>
  </entry>

  <!-- Meta-data only; implied by the lack of <content> and
       <summary> elements. -->
  <entry>
    <title>Garfield</title>
    <link rel="alternate" href="http://publisher.example.com/happycat24.xml" />
    <id>http://publisher.example.com/happycat24.xml</id>
    <updated>2008-08-11T02:15:01Z</updated>
  </entry>

  <!-- Context entry that's meta-data only and not new. -->
  <entry>
    <title>Nermal</title>
    <link rel="alternate" href="http://publisher.example.com/happycat23s.xml" />
    <id>http://publisher.example.com/happycat23s.xml</id>
    <updated>2008-07-10T12:28:13Z</updated>
  </entry>

</feed>



5.  Discovery

A potential subscriber initiates discovery by retrieving the feed to which it wants to subscribe. A feed that acts as a topic as per this specification MUST publish, as a child of //atom:feed or //rss:rss/channel, an atom:link element whose rel attribute has the value hub and whose href attribute contains the hub's endpoint URL. Feeds MAY contain multiple atom:link[@rel="hub"] elements if the publisher wishes to notify multiple hubs. When a potential subscriber encounters one or more such links, that subscriber MAY subscribe to the feed using one or more hub URLs as described in Section 6 (Subscribing and Unsubscribing).

Example:

<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- Normally here would be source, title, author, id, etc ... -->
  <link rel="hub" href="https://myhub.example.com/endpoint" />
  <link rel="self" href="http://publisher.example.com/topic.xml" />
  ....
  <entry>
     ....
  </entry>
  <entry>
     ....
  </entry>
</feed>

Hubs MUST use the same URL for both the publishing and subscribing interfaces, which is why only a single atom:link element is required to declare a hub. Publishers SHOULD use HTTPS [RFC2818] in their hubs' discovery URLs. However, subscribers that do not support HTTPS MAY try to fall back to HTTP [RFC2616], which MAY work depending on the hub's policy.
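
As a rough illustration of discovery, the following Python sketch extracts hub URLs from a feed document. The function name find_hub_urls and the sample feed are assumptions made for this example only; the sketch presumes the feed has already been fetched and that RSS feeds declare the Atom namespace for their atom:link elements.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def find_hub_urls(feed_xml: str) -> list:
    """Return the href of every atom:link[@rel="hub"] at feed or channel level.

    Hypothetical helper, not part of the specification.
    """
    root = ET.fromstring(feed_xml)
    # Atom: hub links are children of <feed>.
    # RSS: hub links are children of <rss>/<channel>.
    if root.tag == "{%s}feed" % ATOM_NS:
        parent = root
    else:
        parent = root.find("channel")
    if parent is None:
        return []
    links = parent.findall("{%s}link" % ATOM_NS)
    return [l.get("href") for l in links
            if l.get("rel") == "hub" and l.get("href")]


EXAMPLE_FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="hub" href="https://myhub.example.com/endpoint" />
  <link rel="self" href="http://publisher.example.com/topic.xml" />
</feed>"""

print(find_hub_urls(EXAMPLE_FEED))
```

A subscriber that finds several hub URLs this way may pick any one (or more) of them for the subscription request described in Section 6.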




6.  Subscribing and Unsubscribing

Subscribing to a topic URL consists of three parts (Sections 6.1 through 6.3) that may occur immediately in sequence or have a delay.

Unsubscribing works in the same way, except with a single parameter changed to indicate the desire to unsubscribe.




6.1.  Subscriber Sends Subscription Request

Subscription is initiated by the subscriber making an HTTP [RFC2616] POST request to the hub URL. This request has a Content-Type of application/x-www-form-urlencoded (described in Section 17.13.4 of [W3C.REC-html401-19991224]) and the following parameters in its body:

hub.callback
REQUIRED. The subscriber's callback URL where notifications should be delivered.
hub.mode
REQUIRED. The literal string "subscribe" or "unsubscribe", depending on the goal of the request.
hub.topic
REQUIRED. The topic URL that the subscriber wishes to subscribe to.
hub.verify
REQUIRED. Keyword describing verification modes supported by this subscriber, as described below. This parameter may be repeated to indicate multiple supported modes.
hub.lease_seconds
OPTIONAL. Number of seconds for which the subscriber would like to have the subscription active. If not present or an empty value, the subscription will be permanent (or active until automatic subscription refreshing (Section 6.3) removes the subscription). Hubs MAY choose to respect this value or not, depending on their own policies. This parameter MAY be present for unsubscription requests and MUST be ignored by the hub in that case.
hub.secret
OPTIONAL. A subscriber-provided secret string that will be used to compute an HMAC digest for authenticated content distribution (Section 7.4). If not supplied, the HMAC digest will not be present for content distribution requests. This parameter SHOULD only be specified when the request was made over HTTPS [RFC2818]. This parameter MUST be less than 200 bytes in length.
hub.verify_token
OPTIONAL. A subscriber-provided opaque token that will be echoed back in the verification request to assist the subscriber in identifying which subscription request is being verified. If this is not included, no token will be included in the verification request.

The following keywords are supported for hub.verify:

sync
The subscriber supports synchronous verification, where the verification request must occur before the subscription request's HTTP response is returned.
async
The subscriber supports asynchronous verification, where the verification request may occur at a later point after the subscription request has returned.

Where repeated keywords are used, their order indicates the subscriber's order of preference. Subscribers MUST use at least one of the modes indicated in the list above, but MAY include additional keywords defined by extension specifications. Hubs MUST ignore verify mode keywords that they do not understand.

Hubs MUST ignore additional request parameters they do not understand.

Hubs MUST allow subscribers to re-request subscriptions that are already active. Each subsequent request to a hub to subscribe or unsubscribe MUST override the previous subscription state for a specific topic URL and callback URL combination once the action is verified. Any failures to confirm the subscription action MUST leave the subscription state unchanged. This is required so subscribers can renew their subscriptions before the lease seconds period is over without any interruption.
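
To make the request format concrete, here is a minimal Python sketch that assembles the form-encoded body of a subscription request from the parameters listed above. The function name build_subscription_body and the example URLs are hypothetical; the sketch assumes the secret is a text string measured in UTF-8 bytes, which the specification does not state.

```python
from urllib.parse import urlencode, parse_qs


def build_subscription_body(callback, topic, mode="subscribe",
                            verify=("sync", "async"), lease_seconds=None,
                            secret=None, verify_token=None):
    """Return an application/x-www-form-urlencoded body (hypothetical helper)."""
    if mode not in ("subscribe", "unsubscribe"):
        raise ValueError("hub.mode must be 'subscribe' or 'unsubscribe'")
    if not verify:
        raise ValueError("at least one hub.verify keyword is required")
    params = [("hub.callback", callback),
              ("hub.mode", mode),
              ("hub.topic", topic)]
    # hub.verify may be repeated; order expresses the subscriber's preference.
    params += [("hub.verify", keyword) for keyword in verify]
    if lease_seconds is not None:
        params.append(("hub.lease_seconds", str(lease_seconds)))
    if secret is not None:
        # Assumption: length is measured on the UTF-8 encoding of the secret.
        if len(secret.encode("utf-8")) >= 200:
            raise ValueError("hub.secret must be less than 200 bytes")
        params.append(("hub.secret", secret))
    if verify_token is not None:
        params.append(("hub.verify_token", verify_token))
    return urlencode(params)


body = build_subscription_body(
    "http://subscriber.example.com/callback",
    "http://publisher.example.com/topic.xml",
    lease_seconds=86400,
    verify_token="token-123",
)
print(body)
```

The resulting string would be sent as the body of a POST to the hub URL with a Content-Type of application/x-www-form-urlencoded.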




6.1.1.  Subscription Parameter Details

The topic and callback URLs MUST NOT contain an anchor fragment. These URLs MAY use HTTP [RFC2616] or HTTPS [RFC2818] schemes. These URLs MAY have port numbers specified; however, hubs MAY choose to disallow certain ports based on their own policies (e.g., security) and return errors for these requests. The topic URL can otherwise be free-form following the URI spec [RFC3986]. Hubs MUST always decode non-reserved characters for these URL parameters; see Section 2.4, "When to Encode or Decode", of the URI spec [RFC3986].

The callback URL MAY contain arbitrary query string parameters (e.g., ?foo=bar&red=fish). Hubs MUST preserve the query string during subscription verification by appending new parameters to the end of the list using the & (ampersand) character to join. Existing parameters with names that overlap with those used by verification requests will not be overwritten; Hubs MUST only append verification parameters to the existing list, if any. For event notification, the callback URL will be POSTed to including any query-string parameters in the URL portion of the request, not as POST body parameters.

Subscribers MAY choose to use HTTPS [RFC2818] for their callback URLs if they care about the privacy of notifications as they come over the wire from the Hub. The use of mechanisms (such as XML signatures) to verify the integrity of notifications coming from the original publisher is out of the scope of this specification.
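
The query-string rule above (append, never overwrite) can be sketched as follows. The helper name build_verification_url is an assumption for illustration; it shows one way a hub might construct the verification URL of Section 6.2 from a callback URL that already carries query parameters.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_verification_url(callback, mode, topic, challenge,
                           lease_seconds=None, verify_token=None):
    """Append verification parameters to the callback URL (hypothetical helper)."""
    parts = urlsplit(callback)
    if parts.fragment:
        raise ValueError("callback URL must not contain a fragment")
    params = [("hub.mode", mode),
              ("hub.topic", topic),
              ("hub.challenge", challenge)]
    if lease_seconds is not None:
        params.append(("hub.lease_seconds", str(lease_seconds)))
    if verify_token is not None:
        params.append(("hub.verify_token", verify_token))
    extra = urlencode(params)
    # Keep the subscriber's own query string untouched and append with "&".
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


url = build_verification_url(
    "http://subscriber.example.com/callback?foo=bar&red=fish",
    "subscribe",
    "http://publisher.example.com/topic.xml",
    "abc123",
    lease_seconds=3600,
)
print(url)
```

Because the existing query string is copied verbatim, a subscriber parameter that happens to share a name with a verification parameter is preserved rather than replaced.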




6.1.2.  Subscription Response Details

The hub MUST respond to a subscription request with an HTTP 204 "No Content" response to indicate that the request was verified and that the subscription is active. If the subscription has yet to be verified (i.e., the hub is using asynchronous verification), the hub MUST respond with a 202 "Accepted" code.

If a hub finds any errors in the subscription request, an appropriate HTTP error response code (4xx or 5xx) MUST be returned. In the event of an error, hubs SHOULD return a description of the error in the response body as plain text. Hubs MAY decide to reject some callback URLs or topic URLs based on their own policies (e.g., domain authorization, topic URL port numbers).

In synchronous mode, the verification (Section 6.2 (Hub Verifies Intent of the Subscriber)) MUST be completed before the hub returns a response. In asynchronous mode, the verification MAY be deferred until a later time. This is useful to enable hubs to defer work; this could allow them to alleviate servers under heavy load or do verification work in batches.
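
From the subscriber's point of view, the response codes in this section could be interpreted along these lines. This is only a sketch: the function name and the returned labels are invented for the example, and the handling of codes the section does not mention is an assumption.

```python
def interpret_subscription_response(status):
    """Map the hub's HTTP status to a subscriber-side outcome (hypothetical labels)."""
    if status == 204:
        return "verified"   # request verified, subscription state updated
    if status == 202:
        return "pending"    # accepted, verification will happen asynchronously
    if 400 <= status < 600:
        return "error"      # hub rejected the request or failed
    # Assumption: anything else is unexpected for this endpoint.
    return "unexpected"


for code in (204, 202, 400, 503, 301):
    print(code, interpret_subscription_response(code))
```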




6.2.  Hub Verifies Intent of the Subscriber

In order to prevent an attacker from creating unwanted subscriptions on behalf of a subscriber (or unsubscribing desired ones), a hub must ensure that the subscriber did indeed send the subscription request.

The hub verifies a subscription request by sending an HTTP [RFC2616] GET request to the subscriber's callback URL as given in the subscription request. This request has the following query string arguments appended (format described in Section 17.13.4 of [W3C.REC-html401-19991224]):

hub.mode
REQUIRED. The literal string "subscribe" or "unsubscribe", which matches the original request to the hub from the subscriber.
hub.topic
REQUIRED. The topic URL given in the corresponding subscription request.
hub.challenge
REQUIRED. A hub-generated, random string that MUST be echoed by the subscriber to verify the subscription.
hub.lease_seconds
REQUIRED/OPTIONAL. The hub-determined number of seconds that the subscription will stay active before expiring, measured from the time the verification request was made from the hub to the subscriber. Hubs MUST supply this parameter for subscription requests. This parameter MAY be present for unsubscribe requests and MUST be ignored by subscribers during unsubscription.
hub.verify_token
OPTIONAL. The subscriber-provided opaque token from the corresponding subscription request, if one was provided.



6.2.1.  Verification Details

The subscriber MUST confirm that the hub.topic and hub.verify_token correspond to a pending subscription or unsubscription that it wishes to carry out. If so, the subscriber MUST respond with an HTTP success (2xx) code with a response body equal to the hub.challenge parameter. If the subscriber does not agree with the action, the subscriber MUST respond with a 404 "Not Found" response.

For synchronous verification, the hub MUST consider other server response codes (3xx, 4xx, 5xx) to mean that the verification request has immediately failed and no retries should occur. If the subscriber returns an HTTP success (2xx) but the content body does not match the hub.challenge parameter, the hub MUST also consider verification to have failed.

For asynchronous verification, the hub MUST consider other server response codes (3xx, 4xx, and 5xx) to mean that the subscription action was temporarily not verified. If the subscriber returns an HTTP success (2xx) but the content body does not match the hub.challenge parameter, the hub MUST consider this to be a temporary failure and retry. The hub SHOULD retry verification a reasonable number of times over the course of a longer time period (e.g., 6 hours) until a definite acknowledgement (positive or negative) is received. If a definite response still cannot be determined after this retry period, the subscription action verification MUST be abandoned, leaving the previous subscription state.

Hubs MAY make the hub.lease_seconds equal to the period the subscriber passed in their subscription request but MAY change the value depending on the hub's policies. To sustain a temporary subscription, the subscriber MUST re-request the subscription on the hub before hub.lease_seconds seconds has elapsed. For permanent subscriptions with no hub.lease_seconds value specified, the behavior is different, as described in Section 6.3 (Automatic Subscription Refreshing).
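
On the subscriber side, the verification rules above might be handled roughly as in the sketch below. It is framework-neutral: handle_verification takes the already-decoded query parameters and returns a status code and body. The function name and the shape of the pending table (keyed by mode and topic, storing the expected verify token) are assumptions made for this example.

```python
def handle_verification(query, pending):
    """Decide how to answer a hub verification GET request.

    query:   dict of decoded query-string parameters from the hub
    pending: dict mapping (hub.mode, hub.topic) to the verify token the
             subscriber sent, or None if it sent none (hypothetical store)
    Returns a (status_code, response_body) tuple.
    """
    key = (query.get("hub.mode"), query.get("hub.topic"))
    challenge = query.get("hub.challenge")
    if challenge is None or key not in pending:
        return 404, ""   # not an action this subscriber asked for
    if pending[key] != query.get("hub.verify_token"):
        return 404, ""   # token does not match the pending request
    # Agree with the action by echoing the challenge in a 2xx response.
    return 200, challenge


pending = {("subscribe", "http://publisher.example.com/topic.xml"): "token-123"}
print(handle_verification({
    "hub.mode": "subscribe",
    "hub.topic": "http://publisher.example.com/topic.xml",
    "hub.challenge": "abc123",
    "hub.lease_seconds": "3600",
    "hub.verify_token": "token-123",
}, pending))
```

A real subscriber would also record hub.lease_seconds at this point so that it knows when to re-request the subscription.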




6.3.  Automatic Subscription Refreshing

Before a subscription expires (i.e., before hub.lease_seconds elapses), Hubs MUST recheck with subscribers to see if a continued subscription is desired. Hubs do this by sending the subscriber a verification request (Section 6.2) with hub.mode equal to subscribe. This request MUST match the original verification request sent to the subscriber (but with a new hub.challenge).

The response codes returned by the subscriber MUST be interpreted the same way as during a subscriber-initiated verification flow. However, this refresh request MUST behave like an initial subscription request; this means that if an auto-refresh response from the subscriber constantly returns an error, the hub MUST give up on the subscription verification action altogether and remove the subscription.

In the case of permanent subscriptions (with no hub.lease_seconds specified in the original request), the hub.lease_seconds value supplied by the hub in the verification request to the subscriber SHOULD represent how many seconds until the hub expects it will next initiate automatic subscription refreshing to ensure that the subscriber is still interested in the topic. This behavior provides the best of both worlds: maximum simplicity of the subscriber through infinitely-long subscriptions, but still garbage collectable subscriptions for hub hygiene.




7.  Publishing

A publisher pings the hub with the topic URL(s) which have been updated and the hub schedules those topics to be fetched and delivered. Because it's just a ping to notify the hub of the topic URL (without a payload), no authentication from the publisher is required.




7.1.  New Content Notification

When new content is added to a feed, a notification is sent to the hub by the publisher. The hub MUST accept a POST request to the hub URL containing the notification. This request MUST have a Content-Type of application/x-www-form-urlencoded (described in Section 17.13.4 of [W3C.REC-html401-19991224]) and the following parameters in its body:

hub.mode
REQUIRED. The literal string "publish".
hub.url
REQUIRED. The topic URL of the topic that has been updated. This field may be repeated to indicate multiple topics that have been updated.

The new content notification is a signal to the hub that there is new content available. The hub SHOULD arrange for a content fetch request (Section 7.2 (Content Fetch)) to be performed in the near future to retrieve the new content. If the notification was acceptable, the hub MUST return a 204 No Content response. If the notification is not acceptable for some reason, the hub MUST return an appropriate HTTP error response code (4xx or 5xx). Hubs MUST return a 204 No Content response even when they have no subscribers for any of the specified topic URLs.
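
A publisher's ping could look something like the following Python sketch, which uses only the standard library. The names build_publish_body and ping_hub are hypothetical, and ping_hub is defined but not called here because it would perform a real network request.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request, urlopen


def build_publish_body(topic_urls):
    """Form-encode a publish notification; hub.url repeats once per topic."""
    if not topic_urls:
        raise ValueError("at least one topic URL is required")
    params = [("hub.mode", "publish")]
    params += [("hub.url", url) for url in topic_urls]
    return urlencode(params)


def ping_hub(hub_url, topic_urls):
    """POST the notification and report whether the hub answered 204.

    Note: urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    request = Request(
        hub_url,
        data=build_publish_body(topic_urls).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urlopen(request) as response:
        return response.status == 204


publish_body = build_publish_body([
    "http://publisher.example.com/happycats.xml",
    "http://publisher.example.com/topic.xml",
])
print(publish_body)
```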




7.2.  Content Fetch

When the hub wishes to retrieve new content for a topic, the hub sends an HTTP [RFC2616] GET request to the topic URL. The hub MUST follow HTTP redirects. The Hub SHOULD use best practices for caching headers in its requests (e.g., If-None-Match, If-Modified-Since).

The request SHOULD include the header field User-Agent in the form expected by HTTP [RFC2616]. The header MAY have one or more additional suffixes like (example.com; 20 subscribers) to indicate the number of subscriptions the hub has active for this topic.

If present, the first suffix MUST indicate the total number of subscriptions the hub has aggregated across all subscriber domains by means of the X-Hub-On-Behalf-Of header (Section 7.3 (Content Distribution)). Any additional suffixes indicate a breakdown of subscriber counts across subscriber domains. This allows content publishers to distinguish the source of their subscriber counts and mitigate subscriber count spam. An example header could be (ignoring line-wrapping):

User-Agent: MyHub (+http://hub.example.com; 26 subscribers)
    (sub.example.com; 4 subscribers)
    (other-sub.example.com; 22 subscribers)

If, after a content fetch, the hub determines that the topic feed content has changed, the hub MUST send information about the changes to each of the subscribers to the topic (Section 7.3 (Content Distribution)). Hubs MUST consider new feed entries, updated entries, or changes to the surrounding feed document as significant content changes that require content distribution.
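
The User-Agent format shown above can be assembled with a few lines of Python. This is a sketch under assumptions: build_user_agent is a made-up name, and the per-domain counts are presumed to have been aggregated already from X-Hub-On-Behalf-Of values (Section 7.3).

```python
def build_user_agent(hub_name, hub_url, counts_by_domain):
    """Build a User-Agent value with a total suffix and per-domain suffixes."""
    total = sum(counts_by_domain.values())
    # First suffix: total across all subscriber domains.
    parts = ["%s (+%s; %d subscribers)" % (hub_name, hub_url, total)]
    # Remaining suffixes: breakdown per subscriber domain.
    for domain, count in counts_by_domain.items():
        parts.append("(%s; %d subscribers)" % (domain, count))
    return " ".join(parts)


user_agent = build_user_agent(
    "MyHub",
    "http://hub.example.com",
    {"sub.example.com": 4, "other-sub.example.com": 22},
)
print(user_agent)
```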




7.3.  Content Distribution

A content distribution request is an HTTP [RFC2616] POST request from hub to the subscriber's callback URL with the list of new and changed entries. This request MUST have a Content-Type of application/atom+xml when the request body is an Atom [RFC4287] feed document, or a Content-Type of application/rss+xml when the request body is an RSS [RSS20] feed document. The behavior for other content types is not yet defined. Hubs MAY transform the content type and request body as desired (e.g., for language translation).

If the document represents a single feed being replicated for the subscriber, then the feed-level elements SHOULD be preserved aside from the atom:entry or rss:item elements. However, the atom:id element MUST be reproduced exactly. The other atom:updated and atom:title elements required by the Atom specification SHOULD be present. Each atom:entry or rss:item element in the feed contains the content from an entry in the single topic that the subscriber has an active subscription for. Essentially, in the single feed case the subscriber will receive a feed document that looks like the original but with old content removed.

The successful response from the subscriber's callback URL MUST be an HTTP success (2xx) code. The hub MUST consider all other subscriber response codes as failures; that means subscribers MUST NOT use HTTP redirects for moving subscriptions. The response body from the subscriber MUST be ignored by the hub. Hubs SHOULD retry notifications repeatedly until successful (up to some reasonable maximum over a reasonable time period). Subscribers SHOULD respond to notifications as quickly as possible; their success response code SHOULD only indicate receipt of the message, not acknowledgment that it was successfully processed by the subscriber.

The subscriber's callback response MAY include the header field X-Hub-On-Behalf-Of with an integer value, possibly approximate, representing the number of subscribers on behalf of which this feed notification was delivered. This value SHOULD be aggregated by the hub across all subscribers and used to provide the subscriber counts in the User-Agent header field sent to publishers (Section 7.2 (Content Fetch)). Hubs MAY ignore or respect X-Hub-On-Behalf-Of values from subscribers depending on their own policies (e.g., to prevent spam).
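
For the single-feed case, a hub's notification body might be derived from the fetched feed as in this Python sketch. The specification does not prescribe how the hub detects changes; here an entry is assumed to be new or changed when its atom:updated value differs from the one last delivered, and changes to feed-level elements are not tracked. The name build_notification and the seen mapping are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def build_notification(feed_xml, seen):
    """Return the feed with old entries removed, or None if nothing changed.

    seen: dict mapping entry atom:id to the atom:updated value last delivered
          (hypothetical hub-side state).
    """
    root = ET.fromstring(feed_xml)
    changed = 0
    # findall returns a list, so removing entries while looping is safe.
    for entry in root.findall("{%s}entry" % ATOM_NS):
        entry_id = entry.findtext("{%s}id" % ATOM_NS)
        updated = entry.findtext("{%s}updated" % ATOM_NS)
        if seen.get(entry_id) == updated:
            root.remove(entry)   # already delivered in this form
        else:
            changed += 1
    if changed == 0:
        return None
    # Feed-level elements (id, title, links, ...) are kept as they were.
    return ET.tostring(root, encoding="unicode")


SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://publisher.example.com/happycats.xml</id>
  <title>Happy cats</title>
  <updated>2008-08-11T02:15:01Z</updated>
  <entry>
    <id>http://publisher.example.com/happycat25.xml</id>
    <updated>2008-08-11T02:15:01Z</updated>
  </entry>
  <entry>
    <id>http://publisher.example.com/happycat23s.xml</id>
    <updated>2008-07-10T12:28:13Z</updated>
  </entry>
</feed>"""

already_sent = {"http://publisher.example.com/happycat23s.xml": "2008-07-10T12:28:13Z"}
print(build_notification(SAMPLE_FEED, already_sent))
```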




7.4.  Authenticated Content Distribution

If the subscriber supplied a value for hub.secret in their subscription request, the hub MUST generate an HMAC signature of the payload and include that signature in the request headers of the content distribution request. The X-Hub-Signature header's value MUST be in the form sha1=signature where signature is a 40-byte, hexadecimal representation of a SHA1 signature [RFC3174]. The signature MUST be computed using the HMAC algorithm [RFC2104] with the request body as the data and the hub.secret as the key.

When subscribers receive a content distribution request with the X-Hub-Signature header specified, they SHOULD recompute the SHA1 signature with the shared secret using the same method as the hub. If the signature does not match, subscribers MUST still return a 2xx success response to acknowledge receipt, but locally ignore the message as invalid. Using this technique along with HTTPS for subscription requests enables simple subscribers to receive authenticated content delivery from hubs without the need for subscribers to run an HTTPS server.
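
The signature scheme can be sketched in Python with the standard hmac and hashlib modules. The function names sign_payload and verify_signature are assumptions for this example; the secret and body are handled as raw bytes.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the X-Hub-Signature value for a request body (hub side)."""
    digest = hmac.new(secret, body, hashlib.sha1).hexdigest()
    return "sha1=" + digest


def verify_signature(secret: bytes, body: bytes, header_value) -> bool:
    """Recompute the signature and compare it with the header (subscriber side)."""
    if not header_value:
        return False
    expected = sign_payload(secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_value.encode("utf-8"))


secret = b"my-shared-secret"
payload = b"<feed xmlns='http://www.w3.org/2005/Atom'></feed>"
header = sign_payload(secret, payload)
print(header)
print(verify_signature(secret, payload, header))
```

Whatever the outcome of the comparison, the subscriber still answers with a 2xx code; a mismatch only means the payload is discarded locally.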




7.5.  Aggregated Content Distribution

For Atom feeds only. Pending further review.

When a subscriber indicates the same callback URL is used for multiple subscriptions, hubs MAY choose to combine content delivery requests into a single payload containing an aggregated set of feeds. This bulk delivery results in fewer requests and more efficient distribution. If the subscriber indicated a hub.secret value for these overlapping subscriptions, the secret MUST also be the same for all subscriptions. This allows the hub to generate a single X-Hub-Signature header to sign the entire payload. Hubs MUST return an error response (4xx, 5xx) for subscription requests with overlapping callback URLs and different secret values.

With an aggregated set of feeds, the hub SHOULD reproduce all of the elements from the source feed inside the corresponding atom:entry in the content distribution request by using an atom:source element. However, the atom:id value MUST be reproduced exactly within the source element. If the source entry does not have an atom:source element, the hub MUST create an atom:source element containing the atom:id element. The hub SHOULD also include the atom:title element and an atom:link element with rel="self" values that are functionally equivalent to the corresponding elements in the original topic feed.

Example aggregated feed:

<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Aggregated feed</title>
  <updated>2008-08-11T02:17:44Z</updated>
  <id>http://myhub.example.com/aggregated?1232427842-39823</id>

  <entry>
    <source>
      <id>http://www.example.com/foo</id>
      <link rel="self" href="http://publisher.example.com/foo.xml" />
      <author>
        <name>Mr. Bar</name>
      </author>
    </source>
    <title>Testing Foo</title>
    <link href="http://publisher.example.com/foo24.xml" />
    <id>http://publisher.example.com/foo24.xml</id>
    <updated>2008-08-11T02:15:01Z</updated>
    <content>
      This is some content from the user named foo.
    </content>
  </entry>

  <entry>
    <source>
      <id>http://www.example.com/bar</id>
      <link rel="self" href="http://publisher.example.com/bar.xml" />
      <author>
        <name>Mr. Bar</name>
      </author>
    </source>
    <title>Testing Bar</title>
    <link href="http://publisher.example.com/bar18.xml" />
    <id>http://publisher.example.com/bar18.xml</id>
    <updated>2008-08-11T02:17:44Z</updated>
    <content>
      Some data from the user named bar.
    </content>
  </entry>

</feed>
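
The secret-consistency rule for overlapping callback URLs could be checked at subscription time roughly as follows. This is a sketch only: the function name and the in-memory subscription table are hypothetical, and treating "no secret" versus "some secret" as a mismatch is an assumption the text above does not settle. A re-request for the same topic is skipped because it overrides the earlier subscription (Section 6.1).

```python
def secret_is_consistent(subscriptions, topic, callback, secret):
    """Check a new request against existing subscriptions for the same callback.

    subscriptions: dict mapping (topic URL, callback URL) to the stored
                   hub.secret, or None when none was supplied.
    Returns False when the hub should answer with an error response.
    """
    for (other_topic, other_callback), other_secret in subscriptions.items():
        if other_callback != callback or other_topic == topic:
            continue  # different callback, or a renewal of the same subscription
        if other_secret != secret:
            return False
    return True


existing = {
    ("http://publisher.example.com/foo.xml",
     "http://subscriber.example.com/callback"): "s3cret",
}
print(secret_is_consistent(existing, "http://publisher.example.com/bar.xml",
                           "http://subscriber.example.com/callback", "s3cret"))
```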



8.  Best Practices

(This section is non-normative.)




8.1.  For Hubs

The vast majority of feeds on the web have multiple variants that contain essentially the same content but appear on different URLs. A common set of variants is http://example.com/feed?format=atom and http://example.com/feed?format=rss. In practice, these URLs also fail to set their //atom:feed/link[@rel="self"] values properly, meaning it's difficult for subscribers to discover which feed URL they truly wanted. Making matters worse, feeds often have redirects (temporary or permanent) to new hosting locations, analytics providers, or other feed-processors. To solve this, it's important for Hub implementations to determine feed URL equivalences using heuristics. Examples: follow all feed URLs redirects and see if they end up at the same location; use overlaps in feed HTML alternate links; use the atom:id value across all domains. This specific problem is considered out of scope of this specification but this section is meant as a reminder to implementors that feed URL aliasing is an important issue that hubs should address instead of putting the burden on publishers and subscribers.




8.2.  For Subscribers

The hub.verify_token parameter in subscription requests enables subscribers to verify the identity and intent of the hub making the verification request. Subscribers should use the token to retrieve internal state to ensure the subscription request outcome is what they intended.




9. References

[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, “HMAC: Keyed-Hashing for Message Authentication,” RFC 2104.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” RFC 2119.
[RFC2606] Eastlake, D. and A. Panitz, “Reserved Top Level DNS Names,” RFC 2606.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616.
[RFC2818] Rescorla, E., “HTTP Over TLS,” RFC 2818, May 2000.
[RFC3174] Eastlake, D. and P. Jones, “US Secure Hash Algorithm 1 (SHA1),” RFC 3174, September 2001.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” RFC 3986.
[RFC4287] Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format,” RFC 4287.
[RSS20] Winer, D., “RSS 2.0.”
[W3C.REC-html401-19991224] Raggett, D., Hors, A., and I. Jacobs, “HTML 4.01 Specification,” World Wide Web Consortium Recommendation REC-html401-19991224, December 1999.
[XEP-0060] Millard, P., Saint-Andre, P., and R. Meijer, “Publish-Subscribe,” XSF XEP 0060.



Appendix A.  Specification Feedback

Feedback on this specification is welcomed via the pubsubhubbub mailing list, pubsubhubbub@googlegroups.com. For more information, see the PubSubHubbub group on Google Groups. Also, check out the FAQ and other documentation.




Authors' Addresses

  Brad Fitzpatrick
  Google, Inc
Email:  brad@danga.com
  
  Brett Slatkin
  Google, Inc
Email:  bslatkin@gmail.com
  
  Martin Atkins
  Six Apart Ltd.
Email:  mart@degeneration.co.uk