Posts Tagged ‘http’
Scala introduction — writing an OAuth library
I started out intending to write some Scala examples against the Twitter API, but I soon discovered I needed OAuth first. Given that I use OAuth all the time at work, I figured I could do with learning about it first-hand while learning Scala.
org.junit.rules._
I chose to test drive it with JUnit 4.7 and NetBeans.
NetBeans works almost immediately with scala, and has support for project templates etc — even scala JUnit fixtures.
UPDATE (2010-04-27) I have since discovered IntelliJ to be much better, and there is now a free community edition. IntelliJ supports scala without any fiddling around.
JUnit mostly works, though rules don’t and neither do some matchers. Even though rules don’t work, I have included it anyway because I have the t-shirt.
You can find the project on github.
Important abstractions
- SignatureBaseString.
  - Characterized by three ampersand-separated segments: verb, uri, parameters.
  - URL encoding must conform to RFC 3986. The following characters are considered unreserved and must not be encoded:
    ALPHA, DIGIT, '-', '.', '_', '~'
- Signature.
  - The signature is a keyed-Hash Message Authentication Code (HMAC).
  - The consumer secret is a required part of the HMAC secret key.
  - The token secret is optionally included in the HMAC secret key:
    (consumer_secret, token_secret) => uri_encoded_consumer_secret&[uri_encoded_token_secret]
- OAuthCredential. Represents the secret key(s) used to create the HMAC signature. OAuth requires a consumer credential, and optionally a token credential, representing the end user.
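The abstractions above can be sketched in a few lines of Scala. This is not the code from the project on github, only an illustration: the object and method names are hypothetical, HMAC-SHA1 with Base64 output is assumed as the signature method, and the caller is assumed to pass an already-normalized URI.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical names throughout; a sketch, not the library's API.
object OAuthSketch {
  // RFC 3986 unreserved characters: ALPHA, DIGIT, '-', '.', '_', '~'
  private val unreserved: Set[Char] =
    (('A' to 'Z') ++ ('a' to 'z') ++ ('0' to '9') ++ Seq('-', '.', '_', '~')).toSet

  // Percent-encode every UTF-8 byte that is not an unreserved character.
  def percentEncode(s: String): String =
    s.getBytes(UTF_8).map { b =>
      val c = (b & 0xff).toChar
      if (unreserved(c)) c.toString else "%%%02X".format(b & 0xff)
    }.mkString

  // Three ampersand-separated segments: verb, uri, parameters.
  def signatureBaseString(verb: String, uri: String, params: Seq[(String, String)]): String = {
    val normalized = params
      .map { case (k, v) => (percentEncode(k), percentEncode(v)) }
      .sorted
      .map { case (k, v) => k + "=" + v }
      .mkString("&")
    Seq(verb.toUpperCase, uri, normalized).map(percentEncode).mkString("&")
  }

  // uri_encoded_consumer_secret&[uri_encoded_token_secret]
  def signingKey(consumerSecret: String, tokenSecret: Option[String]): String =
    percentEncode(consumerSecret) + "&" + tokenSecret.map(percentEncode).getOrElse("")

  // Assumption: HMAC-SHA1, Base64-encoded.
  def sign(baseString: String, key: String): String = {
    val mac = Mac.getInstance("HmacSHA1")
    mac.init(new SecretKeySpec(key.getBytes(UTF_8), "HmacSHA1"))
    Base64.getEncoder.encodeToString(mac.doFinal(baseString.getBytes(UTF_8)))
  }
}
```

Note that the ampersand in the signing key is always present, even when there is no token secret.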
Now that these core concepts are complete, I am working on high-level policy, like classes for generating signed URLs and authorization headers.
Notes
JUnit — expecting exceptions in scala
Assuming JUnit 4.x, a test can expect an exception using the test annotation:
Java:
@Test(expected = IllegalArgumentException.class)
public void ExampleThrowsException() {
    throw new IllegalArgumentException();
}
This needs to be modified for scala:
Scala:
@Test { val expected = classOf[IllegalArgumentException] }
def ExampleThrowsException {
  throw new IllegalArgumentException
}
The reason for it is outlined here in the Java annotations section on named parameters.
Here is the documentation for scala annotations. See also: the documentation for scala 2.7.3 (includes dbc).
Closures and return
The return statement immediately returns from the current method, even if you’re within a closure. Omit return in this case — return is optional anyway.
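A minimal sketch of that behaviour, with hypothetical names: the return inside the closure passed to foreach exits describe itself, not just the closure. Newer Scala 3 compilers warn that this kind of non-local return is deprecated, which is one more reason to omit return.

```scala
object ReturnInClosureSketch {
  // `return` inside the closure returns from describe, skipping the rest of the list.
  def describe(xs: List[Int]): String = {
    xs.foreach { x =>
      if (x < 0) return "found negative"
    }
    "all non-negative"
  }

  // The same logic without return: the last expression is the result.
  def describeWithoutReturn(xs: List[Int]): String =
    if (xs.exists(_ < 0)) "found negative" else "all non-negative"
}
```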
When to use semicolon line terminator
Never — apart from:
- When a method returns Unit (equivalent to void) and you aren’t using return keyword. [TBD: Add example].
How to use blocks
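// block is a by-name parameter, so it is re-evaluated each time it is referenced
protected def times(count : Int)(block : => Unit) : Unit = {
  1.to(count).foreach(_ => block)
}

var count = 1
times(2) {
  println("Printed " + count + " time(s)")
  count += 1 // without this the same count is printed on every pass
}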
See also: some executable examples on github
References
- Scala oauth lib (Github)
- OAuth specification
- Scala 2.7.3 documentation
- OAuth ruby gem
- OAuth test client
HTTP Proxy caching
My next project involves delivering files via HTTP, and as part of the optimization we are going to implement a proxy cache. The aim is to reduce the computation the origin server has to do when serving resources.
HTTP response caching
The HTTP protocol provides a number of cache control mechanisms.
[RFC2616] The basic cache mechanisms in HTTP/1.1 (server-specified expiration times and validators) are implicit directives to caches. In some cases, a server or client might need to provide explicit directives to the HTTP caches. We use the Cache-Control header for this purpose.
The Cache-Control header allows a client or server to transmit a variety of directives in either requests or responses. These directives typically override the default caching algorithms. As a general rule, if there is any apparent conflict between header values, the most restrictive interpretation is applied (that is, the one that is most likely to preserve semantic transparency). However, in some cases, cache-control directives are explicitly specified as weakening the approximation of semantic transparency (for example, “max-stale” or “public”).
Cache-control headers are a mechanism for supplying hints to servers and end-user applications concerning how resources should be validated, revalidated and cached.
Example
Here’s the response without a proxy:
HTTP/1.0 200 OK
Date: Sat, 11 Jul 2009 12:05:07 GMT
Server: Apache/2.0.52 (Red Hat)
Cache-Control: max-age=315360000
Expires: Mon, 28 Jul 2014 23:30:00 GMT
Last-Modified: Sun, 07 Jun 2009 19:53:21 GMT
Accept-Ranges: bytes
Content-Length: 134318
Content-Type: image/jpeg
Age: 7
X-Cache: HIT from photocache413.flickr.ac4.yahoo.com
X-Cache-Lookup: HIT from photocache413.flickr.ac4.yahoo.com:81
X-Cache: MISS from photocache427.flickr.ac4.yahoo.com
X-Cache-Lookup: MISS from photocache427.flickr.ac4.yahoo.com:80
Via: 1.1 photocache413.flickr.ac4.yahoo.com:81 (squid/2.7.STABLE6), 1.0 photocache427.flickr.ac4.yahoo.com:80 (squid/2.7.STABLE6)
Connection: close
And here it is with a local proxy:
HTTP/1.0 200 OK
Date: Sat, 11 Jul 2009 16:14:33 GMT
Server: Apache/2.0.52 (Red Hat)
Cache-Control: max-age=315360000
Expires: Mon, 28 Jul 2014 23:30:00 GMT
Last-Modified: Sun, 07 Jun 2009 19:53:21 GMT
Accept-Ranges: bytes
Content-Length: 134318
Content-Type: image/jpeg
X-Cache: HIT from photocache413.flickr.ac4.yahoo.com
X-Cache-Lookup: HIT from photocache413.flickr.ac4.yahoo.com:81
X-Cache: MISS from photocache427.flickr.ac4.yahoo.com
X-Cache-Lookup: MISS from photocache427.flickr.ac4.yahoo.com:80
X-Cache: MISS from 68e99101007e4d9
X-Cache-Lookup: HIT from 68e99101007e4d9:3128
Via: 1.1 photocache413.flickr.ac4.yahoo.com:81 (squid/2.7.STABLE6), 1.0 photocache427.flickr.ac4.yahoo.com:80 (squid/2.7.STABLE6), 1.0 68e99101007e4d9:3128 (squid/2.7.STABLE6)
Connection: keep-alive
Proxy-Connection: keep-alive
Differences:
- The first one has an Age header: this represents a cache’s estimate of the time in seconds since the response was generated by the origin server, i.e. how long it has been cached for. [TBD: Why is this missing when I add my local cache? Should it be forwarded, or not?]
- The second one has an additional cache lookup representing the local proxy we’ve added.
Cache control headers
Cache-control
The Cache-Control general-header field is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain. The directives specify behavior intended to prevent caches from adversely interfering with the request or response.
There are both request and reply directives.
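To make the directives concrete, here is a rough Scala sketch that splits a Cache-Control value into directives and compares max-age with the Age header, as a cache would when deciding whether a stored response is still fresh. The names are hypothetical, and the comma split is a simplification that would mishandle quoted values containing commas.

```scala
import scala.util.Try

// Hypothetical helper, not part of Squid or any HTTP library.
object CacheControlSketch {
  // "max-age=60, public" => Map("max-age" -> Some("60"), "public" -> None)
  def parseDirectives(headerValue: String): Map[String, Option[String]] =
    headerValue.split(",").map(_.trim).filter(_.nonEmpty).map { directive =>
      val parts = directive.split("=", 2)
      val name = parts(0).trim.toLowerCase
      val value: Option[String] =
        if (parts.length == 2) Some(parts(1).trim.stripPrefix("\"").stripSuffix("\"")) else None
      name -> value
    }.toMap

  // Fresh while the current age is below max-age; None when max-age is absent or not a number.
  def isFresh(cacheControl: String, ageSeconds: Long): Option[Boolean] =
    parseDirectives(cacheControl).get("max-age").flatten
      .flatMap(v => Try(v.toLong).toOption)
      .map(maxAge => ageSeconds < maxAge)
}
```

With the headers from the example response, isFresh("max-age=315360000", 7) yields Some(true).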
Squid config
To strip out all the comments:
grep -P '^(\w)|^(#[\s]+TAG:)' squid.conf.default > squid.conf
This makes squid.conf easier to work with. By default there is already a backup of this file called squid.conf.default.
Minimum config
To allow local machine to access local proxy, add the following:
acl localnet src 127.0.0.1/32
If this one is not added, then all requests from localhost will be denied.
There are also other recommendations in the QUICKSTART file.
Debugging access control lists
You can add a debug setting to the conf file:
debug_options 28,9
This switches on debug level 9 (most verbose) for section 28, Access Control. For the full set of available sections, see /docs/debug-sections.txt.
Cache inspection
You can see what’s going on using the logs generated. For example, to look at cache hits, switch debugging on for section 12.
The access log records all cache activity.
TIP: Ensure your client is not sending Pragma:no-cache (curl does this by default), otherwise you’ll see lots of TCP_CLIENT_REFRESH_MISS in your access log.
TCP_MEM_HIT: A valid copy of the requested object was in the cache and it was in memory, thus avoiding disk accesses. This is like TCP_HIT, but the object was found in memory — TCP_HIT means disk access was required.
The store log is also interesting: it gives the status of stored objects. It looks like entries are only recorded here when items are added or removed, i.e., cache hits will not show up.
Entries are tagged with one of:
- SWAPIN (swapped into memory from disk).
- SWAPOUT (saved to disk).
- RELEASE (removed from cache).
Caching ranges
Check your store log after making a fresh range get — you may see:
RELEASE -1 FFFFFFFF
This means the object was not cachable. [TBD: Why is this the case here?]
Problems testing range requests using curl
I was finding that my range header was being passed on to origin server even though I had set
range_offset_limit -1
This setting forces squid to request the entire file and serve the range itself. By turning on debugging for section 64, I could see the range header being forwarded.
Troubleshooting Squid
Excess data
If you get messages about excess data in your cache log:
Excess data from "GET http://www.example-domain/resource.html"
it’s likely that your origin server is sending more bytes than specified by the content length header. If you have control over the origin server, then ensure the write loop is not writing empty buffers — an easy mistake to make.
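One common form of this mistake (an assumption about the cause, not something I have confirmed for every case) is writing the whole buffer on every pass instead of only the bytes that were just read. A minimal Scala sketch of a copy loop that avoids it, with hypothetical names:

```scala
import java.io.{InputStream, OutputStream}

// Hypothetical helper for an origin server's response-writing code.
object CopySketch {
  // Copies in to out and returns the number of bytes written,
  // which should match the Content-Length header.
  def copy(in: InputStream, out: OutputStream, bufferSize: Int = 8192): Long = {
    val buffer = new Array[Byte](bufferSize)
    var total = 0L
    var n = in.read(buffer)
    while (n != -1) {
      out.write(buffer, 0, n) // write only the n bytes just read, never the whole buffer
      total += n
      n = in.read(buffer)
    }
    total
  }
}
```

Writing out.write(buffer) instead would pad the final chunk with stale bytes and push the body past the declared content length.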
Squid v2.7 for Windows and range requests
Can’t seem to get cold range requests to cache. Is this a bug with the windows version? Here is an excellent article describing a possible workaround.
Debugging range requests
Output information to the cache log by enabling these debug sections:
- 11 — Hypertext Transfer Protocol (HTTP)
- 12 — Internet Cache Protocol
- 17 — Request Forwarding
- 64 — HTTP Range Header
- 66 — HTTP Header Tools
- 74 — HTTP Message
debug_options 11,9 12,9 17,9 64,9 66,9 74,9
This will show you if range headers are being forwarded or not. For example, this line shows that we_do_ranges is being set to false:
httpBuildRequestHeader: range specs: 01597430, cachable: 1; we_do_ranges: 0
Even though this is from a range request, the range header is still being forwarded. Had we_do_ranges evaluated to 1, the range header would not have been forwarded. Squid is not supposed to forward the range header if range_offset_limit is set to -1.
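When the cache log gets long, a small filter can pull out the we_do_ranges flag from lines like the one above. This is only a sketch with hypothetical names, and it assumes the log line keeps the format shown here.

```scala
// Hypothetical helper for scanning cache log lines.
object RangeLogSketch {
  private val WeDoRanges = """we_do_ranges:\s*(\d)""".r

  // Some(true) if squid will serve the range itself, Some(false) if the
  // range header is forwarded, None if the line has no we_do_ranges field.
  def weDoRanges(line: String): Option[Boolean] =
    WeDoRanges.findFirstMatchIn(line).map(_.group(1) == "1")
}
```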
Squid will not start — abnormal program termination
If you encounter this error on start, it may be because you have used a port that is in use. Run netstat to check.
For example, if squid is configured for port 3128:
$ netstat -an | grep 3128
Change port using the http_port config setting.
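The same check can be done from code by trying to bind the port. A minimal Scala sketch with hypothetical names; it binds on all local addresses, so it reports a port as taken if any listener already holds it.

```scala
import java.net.ServerSocket
import scala.util.Try

// Hypothetical helper, not part of squid.
object PortCheckSketch {
  // True if a listening socket could be opened on the port (and is closed again straight away).
  def isPortFree(port: Int): Boolean =
    Try(new ServerSocket(port)).map { socket => socket.close(); true }.getOrElse(false)
}
```

For example, PortCheckSketch.isPortFree(3128) returning false would suggest choosing a different http_port.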