Amazon S3, Ruby and Rails slides
published 2007-12-05
The slides from the talk are here. (Yes, they're hosted on S3).
There are two points in the presentation where I switched to a different window.
At the 'S3SH DEMO' slide, I did some live coding showing how you can work with S3 using s3sh. It basically followed the script shown in 's3sh demo script' below, so read that part when you see the 'S3SH DEMO' slide.
At the 'Example: S3Syncer' slide, I switched over to TextMate and showed the code for a simple script that synchronizes a single directory to S3. I then demoed the script to show it working. So, when you see the 'Example: S3Syncer' slide, read the S3Syncer code and S3Syncer demo sections below.
s3sh demo script
Start up s3sh
$> s3sh
Create a bucket.
Show that you can create a bucket multiple times if you own it, but
trying to create a bucket that somebody else owns raises an error.
>> Bucket.create('spatten_s3demo')
=> true
>> Bucket.create('spatten_s3demo')
=> true
>> Bucket.create('test')
AWS::S3::BucketAlreadyExists: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again.
    from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/error.rb:38:in `raise'
    from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/base.rb:72:in `request'
    from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/base.rb:83:in `put'
    from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/bucket.rb:79:in `create'
    from (irb):3
You can save a bucket in a variable using Bucket.find
>> b = Bucket.find('spatten_s3demo')
=> #<AWS::S3::Bucket:0x14ae7b8 @attributes={"prefix"=>nil, "name"=>"spatten_s3demo", "marker"=>nil, "max_keys"=>1000, "is_truncated"=>false, "xmlns"=>"http://s3.amazonaws.com/doc/2006-03-01/"}, @object_cache=[]>
Create a text object
>> S3Object.store('test.txt', 'This is a test', 'spatten_s3demo')
=> #<AWS::S3::S3Object::Response:0x10830590 200 OK>
>> b.objects
=> [#<AWS::S3::S3Object:0x10804170 '/spatten_s3demo/test.txt'>]
>> pp b.objects[0].about
{"last-modified"=>"Wed, 05 Dec 2007 19:56:49 GMT",
 "x-amz-id-2"=>"JACm9T+m9CgZhmj4q6q00OSGHgSyBVAbQ1cgRWGydYZLTKdhLc/IUZ+K7b/1snOc",
 "content-type"=>"text/plain",
 "etag"=>"\"ce114e4501d2f4e2dcea3e17b546f339\"",
 "date"=>"Wed, 05 Dec 2007 19:57:03 GMT",
 "x-amz-request-id"=>"CA170D2AA5DEB0C9",
 "server"=>"AmazonS3",
 "content-length"=>"14"}
=> nil
>> b.objects[0].key
=> "test.txt"
>> b.objects[0].value
=> "This is a test"
Create a binary object and show it in a browser
>> S3Object.store('vampire.jpg', File.open('vampire.jpg'), 'spatten_s3demo')
=> #<AWS::S3::S3Object::Response:0x10764700 200 OK>
Show the photo in browser
This doesn't work, as the file is only readable by me. Make it publicly readable and try again.
>> S3Object.store('vampire.jpg', File.open('vampire.jpg'), 'spatten_s3demo', :access => :public_read)
=> #<AWS::S3::S3Object::Response:0x10747950 200 OK>
Show it in a browser again. It works this time.
Look at bucket.objects. We have to reload the bucket to show the new object.
>> b.objects
=> [#<AWS::S3::S3Object:0x10804170 '/spatten_s3demo/test.txt'>]
>> b.objects(:reload)
=> [#<AWS::S3::S3Object:0x10708080 '/spatten_s3demo/test.txt'>,
    #<AWS::S3::S3Object:0x10708070 '/spatten_s3demo/vampire.jpg'>]
Hash access to bucket objects
>> b['vampire.jpg']
=> #<AWS::S3::S3Object:0x10708070 '/spatten_s3demo/vampire.jpg'>
>> vamp = b['vampire.jpg']
=> #<AWS::S3::S3Object:0x10708070 '/spatten_s3demo/vampire.jpg'>
A look at metadata
>> vamp.content_type
=> "image/jpeg"
>> vamp.size
=> 10817
>> vamp.metadata
=> {}
>> vamp.metadata['subject'] = 'Claire'
=> "Claire"
>> vamp.metadata['photographer'] = 'Nadine Inkster'
=> "Nadine Inkster"
>> vamp.store
=> true
Storing the picture data in a variable
>> picdata = vamp.value
=> "\377\330\377\340\000\020JFIF\000\001\002\000.......
Downloading a picture by writing its value out to an IO object.
>> File.open('vampire_downloaded.jpg', 'w') {|file| file.write(vamp.value)}
=> 10817
>> exit
s3demo $> ls
flowers.jpg vampire.jpg test.txt vampire_downloaded.jpg
s3demo $> open vampire_downloaded.jpg
S3Syncer Code
Please note that this code is really only useful as an example of how to synchronize with S3.
It won't recurse directories and it dies a horrible death if there are any symlinked files in a directory.
If you are looking for something to synchronize directories, check out s3sync.rb.
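If you did want simple recursion, one approach (a hypothetical extension, not something the script actually does) is to swap Dir.entries for a recursive glob that returns paths relative to the base directory and skips symlinks. A minimal sketch, demoed against a throwaway directory tree:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical recursive replacement for get_local_files: list every
# regular file under `directory` as a path relative to it, skipping
# symlinks so the sync step doesn't choke on them.
def local_files_recursive(directory)
  Dir.glob(File.join(directory, '**', '*')).select do |path|
    File.file?(path) && !File.symlink?(path)
  end.map { |path| path.sub(%r{\A#{Regexp.escape(directory)}/}, '') }
end

# Quick demonstration against a temporary directory
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'photos'))
  File.write(File.join(dir, 'test.txt'), 'hello')
  File.write(File.join(dir, 'photos', 'vampire.jpg'), 'not really a jpeg')
  p local_files_recursive(dir).sort
  # => ["photos/vampire.jpg", "test.txt"]
end
```

The relative paths would then double as S3 keys, giving you "directories" in the bucket for free.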
#!/usr/bin/env ruby

require 'digest/md5'
require 'aws/s3'

include AWS::S3

class S3Syncer
  attr_reader :local_files, :files_to_upload

  def initialize(directory, bucket_name)
    @directory = directory
    @bucket_name = bucket_name
  end

  def S3Syncer.sync(directory, bucket)
    syncer = S3Syncer.new(directory, bucket)
    syncer.get_local_files
    syncer.connect_to_s3
    syncer.get_bucket
    syncer.select_files_to_upload
    syncer.sync
  end

  # This does not recurse directories.
  def get_local_files
    @local_files = Dir.entries(@directory)
  end

  def connect_to_s3
    Base.establish_connection!(
      :access_key_id     => ENV['AMAZON_ACCESS_KEY_ID'],
      :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY']
    )
    raise "\nERROR: Connection not made or bad access key " +
          "or bad secret access key. Exiting" unless AWS::S3::Base.connected?
  end

  def get_bucket
    Bucket.create(@bucket_name)
    @bucket = Bucket.find(@bucket_name)
  end

  # Files should be uploaded if
  #   the file doesn't exist in the bucket
  #   OR
  #   the MD5 hashes don't match
  def select_files_to_upload
    @files_to_upload = @local_files.select do |file|
      case
      when File.directory?(local_name(file))
        false # Don't upload directories
      when !@bucket[file]
        true  # Upload if file does not exist on S3
      when @bucket[file].etag != Digest::MD5.hexdigest(File.read(local_name(file)))
        true  # Upload if MD5 sums don't match
      else
        false # the MD5 matches and it exists already, so don't upload it
      end
    end
  end

  # This will choke on symlinked files
  def sync
    (puts "Directories are in sync"; return) if @files_to_upload.empty?
    @files_to_upload.each do |file|
      puts "#{file} ===> #{@bucket.name}:#{file}"
      S3Object.store(file, File.open(local_name(file), 'r'), @bucket_name)
    end
  end

  private

  def local_name(file)
    File.join(@directory, file)
  end
end

if __FILE__ == $0
  S3Syncer.sync('/Users/Scott/versioned/spattendesign/presentations/s3-on-rails/s3demo',
                'spatten_syncdemo')
end
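The upload decision in select_files_to_upload leans on the fact that, for a plain (non-multipart) PUT, the etag S3 reports is the MD5 hex digest of the object's body; that's why the digest of 'This is a test' matches the etag shown in the s3sh transcript above. Here's a standalone sketch of that check, with a hypothetical needs_upload? helper and the remote etag computed locally rather than fetched from S3:

```ruby
require 'digest/md5'
require 'tempfile'

# Hypothetical helper mirroring the etag check in select_files_to_upload:
# re-upload only when the remote etag no longer matches the local MD5.
def needs_upload?(remote_etag, path)
  remote_etag != Digest::MD5.hexdigest(File.read(path))
end

Tempfile.create('s3demo') do |f|
  f.write('This is a test')
  f.flush
  etag = Digest::MD5.hexdigest('This is a test') # what S3 would report
  puts needs_upload?(etag, f.path)   # false: local file and bucket agree
  f.write(' -- edited')
  f.flush
  puts needs_upload?(etag, f.path)   # true: contents changed, sync it
end
```

Note this reads each whole file into memory to hash it, which is fine for a demo directory but worth streaming in chunks for anything large.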
S3Syncer demo
Start with spatten_syncdemo bucket empty, and four files in the local directory.
Run the script
s3demo $> ls
flowers.jpg vampire.jpg test.txt vampire_downloaded.jpg
s3demo $> s3syncer
flowers.jpg ===> spatten_syncdemo:flowers.jpg
test.txt ===> spatten_syncdemo:test.txt
vampire.jpg ===> spatten_syncdemo:vampire.jpg
vampire_downloaded.jpg ===> spatten_syncdemo:vampire_downloaded.jpg
Run it again; it reports that there's no need to do anything.
s3demo $> s3syncer
Directories are in sync
Change a file locally and sync again
s3demo $> vi test.txt

Make some changes using vi, then:

s3demo $> s3syncer
test.txt ===> spatten_syncdemo:test.txt
Delete flowers.jpg using the Firefox S3 Organizer and then sync again.
s3demo $> s3syncer
flowers.jpg ===> spatten_syncdemo:flowers.jpg
So there you go, a quick intro to the wonders of Amazon S3.