Domain Mapping

Sun May 11 22:08:00 UTC 2008

I built MadPolitic to be an account-per-subdomain site. So when a user signs up for an account, they can choose a site name that gets prepended to the domain, e.g. mysite.madpolitic.com. Frequently, MadPolitic’s users will have registered their own domain name and want to map it to their MadPolitic site. MadPolitic is a publishing system for people or groups with a cause, so they might have domains like savethefarmersmarket.com or stopeatingthewhales.com. How do you map a domain name to a specific user site in your Rails app? Here’s how I did it.

Part I: Some Assembly Required

First of all, it’s worth noting that there is no way to fully automate this process. The user has to purchase their domain outside of your application through the registrar of their choice. They also have to use that registrar’s administration console to make some DNS configuration changes. Since I can’t do any of those things for them, I instead provide my users with detailed instructions on making the DNS changes for each of the more popular registrars. This is a bit of a hassle since these sites are beyond my control and can and do frequently tweak their menus or UI enough to render my instructions incorrect.

Buy a Domain

Let’s say we have a photosensitive user named Linda who is outraged at the city for installing street lights in her neighborhood. She created her community activism site at lindasmad.madpolitic.com but would rather people could find her at a more user-friendly URL like www.darkenmystreet.com.

Linda is also a big Danica Patrick fan, so she heads to GoDaddy and registers her new domain.

Configure DNS: the A record

To map darkenmystreet.com and www.darkenmystreet.com to her MadPolitic site, Linda has to make changes to some DNS records on her registrar’s name servers. A newly registered domain is typically set to use a pair of name servers under the control of the registrar. Most of the more popular registrars offer some version of a DNS control panel that lets users modify the DNS settings for their domain on those name servers. In the GoDaddy administration console, for instance, that feature is called Total DNS Control and MX Records.

First, she has to edit what is called an A record, also called an address record. The sole purpose of an address record is to map a domain name to an IP address. This is the function that most of us think of when we think of DNS servers: they translate domain names to IP addresses. When you register a new domain, an A record is typically created that points your domain to one of the registrar’s IP addresses.

Linda edits the A record and sets ‘Host’ to ‘darkenmystreet.com’ and ‘Points to’ to ‘72.249.74.216’, which is the MadPolitic IP address. In most DNS servers there is a shorthand for referring to the current domain: the ‘at’ sign (@). So for ‘Host’, Linda could also enter @.

Configure DNS: the CNAME record

Linda has one more change to make to the DNS records. She also wants to be sure that someone entering www.darkenmystreet.com winds up at her MadPolitic site. To do this, she needs to create a CNAME record, also called an alias. This record has two primary pieces of information: the alias name and the host it points to. Linda enters www for the alias name and @ for the ‘points to’ field. Sometimes this record is automatically created by the registrar and so nothing needs to be done.

Once all the DNS information for her new domain propagates out through the Internet, both darkenmystreet.com and www.darkenmystreet.com will resolve to the MadPolitic IP address.
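
To make the two records concrete, here is a toy model in Ruby of how a lookup follows them. The ZONE hash and the resolve, host_to_label and label_to_host methods are made up for illustration; a real name server does far more than this.

```ruby
# Toy model of Linda's zone. "@" stands for the zone's own name,
# just like in the registrar's control panel.
ZONE_NAME = "darkenmystreet.com"
ZONE = {
  "@"   => { :type => :a,     :value => "72.249.74.216" },  # A record
  "www" => { :type => :cname, :value => "@" }               # CNAME record
}

# Turn a full host name into the label used as a key in ZONE.
def host_to_label(host)
  return "@" if host == ZONE_NAME
  host.sub(/\.#{Regexp.escape(ZONE_NAME)}\z/, "")
end

# Turn a label from ZONE back into a full host name.
def label_to_host(label)
  label == "@" ? ZONE_NAME : "#{label}.#{ZONE_NAME}"
end

# Resolve a host name within the zone to an IP, following CNAMEs.
# Returns nil when the zone has no record for the host.
def resolve(host, hops = 0)
  raise ArgumentError, "too many CNAME hops" if hops > 5
  record = ZONE[host_to_label(host)]
  return nil unless record
  case record[:type]
  when :a     then record[:value]
  when :cname then resolve(label_to_host(record[:value]), hops + 1)
  end
end

apex_ip = resolve("darkenmystreet.com")
www_ip  = resolve("www.darkenmystreet.com")
```

Both lookups end at the same A record, which is why Linda only ever has to enter the MadPolitic IP address in one place.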

Tell MadPolitic about the new domain

The final step for Linda is to log into her MadPolitic account and register her new domain name with my Rails app. I’ll use this to map requests coming from the darkenmystreet.com host to Linda’s MadPolitic site.

Part II: About That Rails App

That takes care of the first half of the equation: requests for the mapped domain are getting routed to my server. At this point, everything else is under my control. I have an Apache virtual host set up to catch requests on port 80 from any host and proxy them to my Rails app. In the app, I have a before_filter in ApplicationController called set_site with this logic

...
if request.host.ends_with? 'madpolitic.com'
  # the request is for a site on the madpolitic domain with no domain mapping
  sitename = request.subdomains(1).first
  site = Site.find_by_sitename(sitename)
else
  # the host is not *.madpolitic.com so see if the host maps to a madpolitic site
  # I actually cache a mapping of mapped_domain => sitename so I don't hit the db each time; removed for this example
  site = Site.find_by_mapped_domain(request.host)
end
# store the id in both cases; nil when no site was found
session[:site] = site ? site.id : nil
if !session[:site]
  logger.error "[#{Time.now.utc.strftime('%m-%d-%Y %H:%M:%S')}]: Failed to locate site based on host of #{request.host}."
  redirect_to 'http://www.madpolitic.com/500.html', :status => 500
  return false
end
...

This method checks whether request.host ends in madpolitic.com. If it does, it handles the request like a typical subdomain-based request and looks up the site by its subdomain. If it does not, the host might be a mapped domain, so I try pulling a site from the database using the mapped domain.

Since Linda registered her mapped domain with my Rails app, the find is successful. If the find fails, then it’s possible someone followed all the directions, but failed to go to the MadPolitic administration console and enter the mapped domain. When I don’t find a site, I can serve an oops page and offer some helpful troubleshooting suggestions.
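
Stripped of Rails, the lookup boils down to something like the following sketch. The two hashes stand in for the sites table, and the names site_id_for, SITES_BY_NAME and SITES_BY_DOMAIN are hypothetical. It also assumes both the bare and the www form of the mapped domain are stored, and it checks for a leading dot so that a host like notmadpolitic.com is not mistaken for a subdomain.

```ruby
# Hypothetical stand-ins for the sites table:
# sitename => site id and mapped_domain => site id.
BASE_DOMAIN     = "madpolitic.com"
SITES_BY_NAME   = { "lindasmad" => 1 }
SITES_BY_DOMAIN = { "darkenmystreet.com" => 1, "www.darkenmystreet.com" => 1 }

# Returns the site id for a request host, or nil when nothing matches.
def site_id_for(host)
  host = host.downcase
  if host == BASE_DOMAIN || host.end_with?(".#{BASE_DOMAIN}")
    # subdomain-based request: the first label is the sitename
    SITES_BY_NAME[host.split(".").first]
  else
    # not on the madpolitic domain, so it may be a mapped domain
    SITES_BY_DOMAIN[host]
  end
end

by_subdomain = site_id_for("lindasmad.madpolitic.com")
by_mapping   = site_id_for("www.darkenmystreet.com")
not_found    = site_id_for("unknown.example.com")
```

A nil result is the case where the oops page gets served.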

Epilogue: Follow That Request

Now let’s meet Robert. He lives on Linda’s street and has been infatuated with her for months. When he heard about her campaign to have the street lights removed, he became an impassioned supporter of her cause. He gets the URL from a flyer in the local coffee shop and logs on to sign the petition.

When Robert enters http://www.darkenmystreet.com into his browser and hits enter, here’s what happens (assuming he’s never hit that domain before).
  1. His browser sends a request out to one of the DNS servers of Robert’s ISP.
  2. That server (called the Resolver in this scheme) has never had a request for www.darkenmystreet.com before, so it doesn’t know the IP. It passes the request on to one of the 13 root name servers.
  3. That server replies to the Resolver saying “I don’t know about that domain, but the name servers for .com should.”
  4. The Resolver then contacts one of the .com name servers with the request. That server has a record mapping the domain to its name servers and returns their addresses to the Resolver.
  5. Now the Resolver contacts one of those name servers. This is the name server run by GoDaddy that Linda configured when she set the A record and CNAME record.
  6. Those record entries translate www.darkenmystreet.com to the MadPolitic IP address and return it to the Resolver.
  7. The Resolver caches the information for future requests and returns the IP to the browser.
  8. The browser sends its request directly to the MadPolitic IP.
  9. My Apache server is listening on that IP on port 80, and intercepts the request. The host in the request header matches up with a virtual server I have configured and Apache proxies the request on to Mongrel and my Rails app.
  10. Rails serves up the page Robert was looking for.
  11. Robert comments on one of Linda’s blog postings with an awkwardly-worded poem wherein he likens his love for Linda to a streetlight.

0 comments | Tags: domain mapping madpolitic subdomains

!transactional fixtures

Thu May 01 20:13:00 UTC 2008

I am referring to transactional fixtures, and I had a mistaken understanding of how they worked. For my own notes, here’s what I learned.

By default, TestCase and RSpec use transactional fixtures. This is set in the respective helper file (test/test_helper.rb or spec/spec_helper.rb). Transactional fixtures is a terrible name for this feature, because the fixtures are not transactional. What transactional fixtures actually means is that each test method is executed in a database transaction which is rolled back when the test is finished.

Here’s what happens when you have a test class with several test methods and one fixture (let’s say :users) specified

  • all rows are deleted from the users table
  • an insert is executed for each user listed in the :users fixture
  • for each test method in the test case
    • a transaction is started
    • the test is run
    • the transaction is rolled back

This means that when the class is finished executing, all the fixture data that was loaded is left in the database.

There’s a gotcha lurking here, which is what bit me. Let’s say you have two models as follows

class User < ActiveRecord::Base
  has_many :zebras
end

class Zebra < ActiveRecord::Base
  belongs_to :user
end

class UserTest < Test::Unit::TestCase
  fixtures :users
  # test methods here
end

class ZebraTest < Test::Unit::TestCase
  fixtures :zebras
  # test methods here
end

When you run rake test:models, all your zebra tests pass. However, if you rake db:test:prepare and run ruby test/unit/zebra_test.rb, suddenly tests are failing. This is because when you run the entire suite, the User test case is running first and its fixture, :users, is getting loaded into the database. When the suite gets to the Zebra test case, the test methods that rely on rows in the users table find the data that was left over from the User test case and so they work. However, when you run the Zebra test case in isolation on a fresh test database, that user data is not loaded, so the tests fail.
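
The effect is easy to reproduce without Rails. The sketch below is a toy simulation, not how Rails implements fixtures: ToyDb and zebra_test_passes? are invented names, a hash plays the part of the test database, and a snapshot and restore plays the part of the transaction.

```ruby
# Toy stand-in for the test database: table name => array of rows.
class ToyDb
  def initialize
    @tables = {}
  end

  # Once per test class: replace the table's rows with the fixture rows.
  def load_fixture(table, rows)
    @tables[table] = rows.dup
  end

  def rows(table)
    @tables.fetch(table, [])
  end

  def insert(table, row)
    (@tables[table] ||= []) << row
  end

  # Once per test method: run it, then roll back whatever it changed.
  def in_transaction
    snapshot = Marshal.load(Marshal.dump(@tables))
    yield
  ensure
    @tables = snapshot
  end
end

# A zebra test that quietly depends on a row in the users table.
def zebra_test_passes?(db)
  db.in_transaction do
    db.insert(:zebras, { :id => 2, :user_id => 1 })  # rolled back afterwards
    db.rows(:users).any? { |user| user[:id] == 1 }
  end
end

# Full suite: the User test case ran first and left :users behind.
suite_db = ToyDb.new
suite_db.load_fixture(:users, [{ :id => 1, :name => "linda" }])
suite_db.load_fixture(:zebras, [{ :id => 1, :user_id => 1 }])
in_suite = zebra_test_passes?(suite_db)

# Zebra test case alone on a freshly prepared database: no users loaded.
fresh_db = ToyDb.new
fresh_db.load_fixture(:zebras, [{ :id => 1, :user_id => 1 }])
in_isolation = zebra_test_passes?(fresh_db)
```

Here in_suite comes out true and in_isolation false, even though the test itself never changed. In the real test case, listing every fixture the tests touch (fixtures :zebras, :users) removes the dependency on what ran earlier.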

This has led to a new best practice for me when writing tests. I always make sure I’m running the TestCase against a clean database while I’m writing it. I do this by frequently running rake db:test:prepare. That way I know that the TestCase has the fixture data it needs to run on its own and is not dependent on data that might be left over from the fixtures of other tests.

0 comments | Tags: rspec testing fixtures

:counter_cache no workie

Sat Apr 19 11:17:00 UTC 2008

I ran into a gotcha trying to use the :counter_cache option on a belongs_to association today. In short, the :counter_cache option can be used on the belongs_to end of a one-to-many association to improve the performance of querying the size of the association. As a quick example

class Dog < ActiveRecord::Base
  has_many :fleas
end

class Flea < ActiveRecord::Base
  belongs_to :dog
end

charlie = Dog.create(:name => "charlie")
charlie.fleas.create(:name => "fleabert")
charlie.reload
charlie.fleas.size

When this code is run from script/console, the development log shows that getting the size of the fleas association on charlie ran this SQL
SELECT count(*) AS count_all FROM `fleas` WHERE (fleas.dog_id = 1)

If you use :counter_cache, however, you can save yourself that extra db query when trying to determine the size of an association.

The Rails Framework docs explain the usage of the option, and in its simple form, all you need to do is add :counter_cache => true to the belongs_to association. So in the above example,

class Flea < ActiveRecord::Base
  belongs_to :dog, :counter_cache => true
end

and add an integer column named #{table_name}_count to the associated class’s table (so add fleas_count to dogs, in this case).

I did this and then reran the code where I create a dog and an associated flea, reload the dog, then ask for the size of the flea association. My development log was still showing a SELECT count(*) statement being executed.

I studied the log a bit more and found how ActiveRecord was trying to manage the counter cache. After creating the flea and adding it to the fleas association, this query is run
UPDATE dogs SET `fleas_count` = `fleas_count` + 1 WHERE (`id` = 1)

I checked the structure of the dogs table and noticed that the default value for the fleas_count column was NULL. In SQL, NULL + 1 is still NULL, so the increment never took effect.

The answer, obviously, was to update my migration and specify a default value of 0 for the counter_cache column
t.integer :fleas_count, :default => 0

The counter cache then worked as advertised.

0 comments | Tags: rails

Bandwidth Tracking

Wed Apr 16 22:04:00 UTC 2008

Like a lot of web applications, MadPolitic has different levels of user accounts. One of the account differentiators is bandwidth allowance; higher-level accounts have an increased bandwidth limit. To implement this, I needed a way to track bandwidth usage per account. Here’s how I did it using Apache.

Use CustomLog and cronolog to create and rotate a custom bandwidth log

The standard Apache access.log was missing several things I needed in order to track bandwidth.
  • the format did not include the host from the request header
  • it also did not include the number of bytes Apache sent back across the network to the client in response to each request
  • the log rotation didn’t give me a clean history of log files by day

To remedy this, I created a CustomLog and used cronolog to get some more advanced log rotation. Apache provides log rotation capability out of the box, but the Apache docs themselves recommend cronolog for more flexible logging.

Here’s the CustomLog I created, from my httpd.conf file

CustomLog "|/usr/local/sbin/cronolog -l /etc/httpd/logs/bandwidth_today -P /etc/httpd/logs/bandwidth_yesterday /etc/httpd/logs/%Y/%m/%d/bandwidth.log" bandwidth env=!dontlog
Apache passes most of the work off to cronolog, which does a few things
  • creates a symbolic link called bandwidth_today to the current bandwidth log file
  • creates a symbolic link called bandwidth_yesterday to the previous day’s bandwidth log file
  • defines a spec that indicates how I want the logfiles rotated and stored, in this case, in directories based on the year, month and day

After the piped invocation of cronolog, the CustomLog directive has two more parameters. The first, bandwidth, references a LogFormat I created. More on that in a moment. The last parameter enables the ability to not log certain requests. For example, if you have Apache serving up other sites or content that you do not want logged, you can add lines like this to your httpd.conf file

SetEnvIf Host "myblog\.mydomain\.com" dontlog

and that final parameter to CustomLog makes sure requests to myblog.mydomain.com don’t wind up in the log.

Create a custom LogFormat

Recall that bandwidth parameter I passed to the CustomLog directive. It should reference a LogFormat directive that tells Apache how I want each line of the log file to look. By default, the Apache access log is missing the two key pieces of information I needed: the host from the request header, and the number of bytes that Apache sent back across the network to the client.

Logging the host was one of the keys to making all this work for my application. That’s because I use a subdomain-per-account approach, so each account gets a unique subdomain like bob.madpolitic.com. Capturing that host information in the log files along with the bytes transferred would give me all the raw data I needed to map bandwidth usage to accounts.

I started by adding those bits into the format and came up with the following

LogFormat "%t %{Host}i %O %h \"%r\" %>s" bandwidth

You can reference the Apache docs for the LogFormat directive for more detail, but in brief, this line tells Apache to log the time (%t), the contents of the Host: header in the request (%{Host}i), the bytes sent over the network to the client (%O), the remote host (%h), the first line of the HTTP request, quoted (\"%r\"), and the HTTP status of the response (%>s). Finally, it gives this format a name, bandwidth.

I now had a log file being written out in a format that I specified that included, most importantly, the host and the number of bytes sent to the client for each request. That log file was being rotated daily and previous logs were stored in a directory structure based on the date. I also had some handy symbolic links to today’s and yesterday’s bandwidth log files.

Make Apache write your Ruby code

This worked great. The plan was to fill a log with this information, and have one log file per day. Then I could use a log analyzer and some Ruby to parse through all those log files, compile the bandwidth statistics, and write it all into my application’s database so that my Rails app could get to it to do things like show the user how much bandwidth they’ve used. After trying it out for a little while, however, I came up with another idea.

Why not have Apache just write the Ruby code for me? I could drop the log analyzer and basically just execute the log file like a Ruby script and have it give me the data in the structure I wanted: a Ruby hash mapping hosts to bytes.

I went back to work on my log format and came up with this

LogFormat "bw[\"%{Host}i\"] = (bw[\"%{Host}i\"]) ? bw[\"%{Host}i\"] = bw[\"%{Host}i\"] + %O : bw[\"%{Host}i\"] = %O" bandwidth

This gave me logs with lines like this

bw["bob.madpolitic.com"] = (bw["bob.madpolitic.com"]) ? bw["bob.madpolitic.com"] = bw["bob.madpolitic.com"] + 123 : bw["bob.madpolitic.com"] = 123
bw["frank.madpolitic.com"] = (bw["frank.madpolitic.com"]) ? bw["frank.madpolitic.com"] = bw["frank.madpolitic.com"] + 3442 : bw["frank.madpolitic.com"] = 3442
bw["bob.madpolitic.com"] = (bw["bob.madpolitic.com"]) ? bw["bob.madpolitic.com"] = bw["bob.madpolitic.com"] + 33 : bw["bob.madpolitic.com"] = 33

Now I had a file full of Ruby code that kept a hash called bw which mapped the host to the number of bytes transferred. Each line would check to see if that host already existed as a key in the hash, and if it did, add the bytes to the total already in the hash. Otherwise it would create a new entry in the hash, host => bytes.
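
One thing to weigh with this approach: the Host header is supplied by the client, so evaluating the log as Ruby means trusting that text. For comparison, here is a rough sketch of the parse-it-yourself route against the earlier plain-text format (time, host, bytes, remote host, request, status). The method name tally_bandwidth, the LINE pattern and the sample lines are all invented for illustration.

```ruby
# Matches the start of a line shaped like:
# [time] host bytes remote_host "request line" status
LINE = /\A\[[^\]]+\] (\S+) (\d+) /

# Sum the bytes sent per host; lines that do not fit the format are skipped.
def tally_bandwidth(log_text)
  totals = Hash.new(0)
  log_text.each_line do |line|
    match = LINE.match(line)
    next unless match
    totals[match[1].downcase] += match[2].to_i
  end
  totals
end

sample = <<LOG
[16/Apr/2008:22:04:00 +0000] bob.madpolitic.com 123 10.0.0.1 "GET / HTTP/1.1" 200
[16/Apr/2008:22:04:05 +0000] frank.madpolitic.com 3442 10.0.0.2 "GET /about HTTP/1.1" 200
[16/Apr/2008:22:05:10 +0000] bob.madpolitic.com 33 10.0.0.1 "GET /favicon.ico HTTP/1.1" 404
LOG

totals = tally_bandwidth(sample)
```

The result is the same host => bytes hash, at the cost of a few more lines of Ruby than the eval version.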

Get the data from the hash into the application database

Next I wrote a Ruby script that gets invoked by cron once a day just after midnight. The script does the following
  1. creates a new empty hash called bw
  2. uses the bandwidth_yesterday symbolic link to read in yesterday’s log file all at once to a string. I process bandwidth totals once a day, and remember that this script is being run just after midnight, so I want to process the previous day’s log
  3. sends the eval message to Kernel and passes the string that contains the log file along
  4. when this is finished, my bw hash is now populated with host => bytes entries
  5. iterates through each entry in the hash and, for each one,
    • uses the key (the host) to find an account in my application. Remember that each account gets a unique subdomain, so the host bob.madpolitic.com would map to bob’s account
    • looks to see if there is already a row for that account, for the current month, with a bandwidth total and creates one if there is not. This would happen the first time that account is ever used, or at the start of each month
    • adds the value from the current hash entry, which is the bytes transferred for that host for the previous day, to the existing value
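
A stripped-down sketch of that script might look like the following. It is not the actual script: the method names are invented, a hash keyed by [sitename, month] stands in for the database table, the account is found by simply taking the first label of the host, and the log text is inlined rather than read through the bandwidth_yesterday link.

```ruby
require "date"

# Evaluate the generated Ruby so that it fills a fresh bw hash.
def bandwidth_from_log(log_text)
  bw = {}
  eval(log_text, binding)   # the log lines refer to the local variable bw
  bw
end

# Fold one day's totals into per-account, per-month totals.
# monthly_stats is a Hash.new(0) keyed by [sitename, month].
def add_to_monthly_stats(monthly_stats, bw, log_date)
  month = log_date.strftime("%m%Y")     # same %m%Y format, e.g. 042008
  bw.each do |host, bytes|
    sitename = host.split(".").first    # bob.madpolitic.com => bob
    monthly_stats[[sitename, month]] += bytes
  end
  monthly_stats
end

log_text = <<'LOG'
bw["bob.madpolitic.com"] = (bw["bob.madpolitic.com"]) ? bw["bob.madpolitic.com"] = bw["bob.madpolitic.com"] + 123 : bw["bob.madpolitic.com"] = 123
bw["frank.madpolitic.com"] = (bw["frank.madpolitic.com"]) ? bw["frank.madpolitic.com"] = bw["frank.madpolitic.com"] + 3442 : bw["frank.madpolitic.com"] = 3442
bw["bob.madpolitic.com"] = (bw["bob.madpolitic.com"]) ? bw["bob.madpolitic.com"] = bw["bob.madpolitic.com"] + 33 : bw["bob.madpolitic.com"] = 33
LOG

monthly_stats = Hash.new(0)
log_date = Date.new(2008, 4, 15)   # the real script would use yesterday's date
add_to_monthly_stats(monthly_stats, bandwidth_from_log(log_text), log_date)
```

Running the same two methods against each day's log keeps adding to the month's totals until the month string changes.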

When it’s all finished I have a table that is populated with the number of bytes transferred for each account, for each month. The table looks like this

create_table "web_stats", :force => true do |t|
  t.column "account_id", :integer
  t.column "month",      :string # stored in a %m%Y format, like 042008
  t.column "bytes",      :bigint
end

I’ve also got all of the historical log files still sitting on my file system (getting backed up once a day) in the event I need to recreate the data or mash it up some other way.

0 comments | Tags: apache bandwidth tracking rails ruby

Callback Recursion

Tue Apr 15 23:05:00 UTC 2008

I made a clumsy mistake with ActiveRecord callbacks recently. I fixed my mistake, but was left wondering if there is a better way to do it.

When I save a specific ActiveRecord object, I need to iterate through all the other instances of that model and flip a flag, saving each one. For example, let’s say I have a piece of functionality where users can create multiple tasks, but only one task can be active at any one time. So I have a model called Task and it has an attribute called active. When a user marks a task as active, I need to go through all their other tasks and make sure they are marked inactive.

Initially, I tried to do this by implementing an after_save callback on the model.

after_save :deactivate_other_tasks

def deactivate_other_tasks
  user.tasks.each do |task|
    task.update_attribute(:active, false)
  end
end

The problem might be obvious to most, but I had to write it, run it, and watch it fail before I saw it. As I iterate through each task, every call to update_attribute triggers the same after_save callback on that task, which iterates through all the tasks again. This little recursion journey continues until the stack overflows.

So I scrapped the callback approach and instead overrode Task#active=

def active=(arg)
  write_attribute(:active, arg) # go ahead and change the attribute of this object
  if (arg =~ /^true$/i) # case-insensitive match of just the string true
    user.tasks.select { |x| x != self }.each { |x| x.update_attribute(:active, false) }
  end
end

This works great. The problem I have with it is that somewhere down the road, it would be easy for someone not familiar with this model to add a before_save or after_save callback for some entirely different reason, and fall into the same recursion trap. I can add some comments at the top of the class

# Dear lord, whatever you do, do NOT add any callbacks triggered on
# save. You cannot comprehend the consequences.

But that’s just silly.

Is there a Railsy way of doing this? Is there a way to save an ActiveRecord object and selectively indicate that you want to skip all the callbacks in that one instance?

0 comments | Tags: rails callbacks

Who's Hungry?

Mon Apr 14 10:44:00 UTC 2008

I put gates at the top and bottom of my stairs, child-proofed all the cabinets and all the drawers. Guess I forgot the fridge.

0 comments | Tags: alex

Blocks and Precedence

Mon Apr 14 09:48:00 UTC 2008

There are two different forms for passing a block to a method in Ruby, one with braces, and one with do/end. I used to think they were equivalent, but recently learned that there is a subtle difference having to do with precedence. As an example

def block_test(arg1)
    "I received '#{arg1}' and " << (block_given? ? "a block which yielded '#{yield}'" : "no block")
end

def speak
    "I've been told to say " << (block_given? ? yield : "nothing")
end

Looking at the brace form of passing blocks to methods, you can see that whether you surround your arguments with parentheses changes who gets the block

block_test speak { "hello" } 
# => "I received 'I've been told to say hello' and no block"

block_test(speak) { "hello" } 
# => "I received 'I've been told to say nothing' and a block which yielded 'hello'"

This is fairly intuitive. However, if you use the do/end form for the block, it gets passed as an argument to block_test whether you use parentheses or not

block_test speak do
    "hello"
end
# => "I received 'I've been told to say nothing' and a block which yielded 'hello'"

block_test(speak) do
    "hello"
end
# => "I received 'I've been told to say nothing' and a block which yielded 'hello'"
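
Where this tends to bite in practice is a chained call such as map passed straight to another method without parentheses. The sketch below uses a made-up helper called report to show who ends up with the block; it assumes Ruby 1.9 or later, where map without a block returns an Enumerator.

```ruby
# Hypothetical helper that reports what it was handed.
def report(value)
  block_note = block_given? ? "with a block" : "without a block"
  "#{value.class} #{block_note}"
end

numbers = [1, 2, 3]

# Braces bind to the nearest call, so map gets the block
# and report receives the doubled array.
with_braces = report numbers.map { |n| n * 2 }

# do/end binds to the outermost call, so report gets the block
# and map is called with no block at all.
with_do_end = report numbers.map do |n|
  n * 2
end
```

Wrapping the argument in parentheses, as in report(numbers.map { |n| n * 2 }), makes the intent explicit either way.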

0 comments | Tags: blocks ruby

A Little irb Context

Fri Apr 11 08:00:00 UTC 2008

When you’re sitting at your irb prompt, sometimes it can get a little confusing trying to remember the context in which all those commands you’re entering are executed. We know that anything entered at the prompt in irb is executed in some context, but it’s easy to forget what it is since it’s loaded for you and is not explicit.

We know that if we ask irb about itself, it tells us its class is Object


>> self.class
=> Object

So if we take that together with two facts from the pickaxe book:

"class and module definitions are executable code", and 
"a class definition is executed with that class as the current object" 

we can begin to cobble together an answer. Great! What was the question again?

The question: what is the context of that irb prompt?
The answer: you are entering code into the class definition of the Object class, and it is being executed in the context of an instance of Object named ‘main’. Clear? I thought so.

Sometimes I find it useful to visualize the context of irb using this example:
class Object

    # all of Object's methods defined here

    # Now imagine that your irb prompt is sitting right here, 
    # in this gap in Object's class, right before the final end, 
    # flashing at you, waiting for you to do great things.

    # >>

    # And every time you hit the Return key, the entire class 
    # definition for Object is executed again, with whatever 
    # command(s) you entered at the prompt appended to the 
    # end of the class definition.
  end
This means you can do things like
class Object

    # all of Object's methods defined here

    # >>
    private 
    def foo(arg)
      puts "#{arg}"
    end

    foo("bar")
   # And you'll see "bar" output.

    # Now define a class to play around with
    class Fool
      def touch
        "slap!"
      end
    end

    fool = Fool.new
    puts fool.touch

    # And you'll see "slap!"
  end

Obviously this is a gross over-simplification of what irb is. But maybe this little Gedankenexperiment will help contextualize your irb-ing.
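
A few lines of Ruby can back up the picture. This is a small sketch meant to be run as a script file rather than typed at the prompt, since irb can treat the visibility of top-level methods a little differently; the variable names are just for illustration.

```ruby
# At the top level, self is the object that prints as "main",
# and it is an instance of Object.
top_level_name  = self.to_s
top_level_class = self.class

# A method defined at the top level of a script becomes a private
# instance method of Object, which is why every object can reach it.
def foo(arg)
  "foo got #{arg}"
end

is_private_on_object = Object.private_method_defined?(:foo)

# send bypasses the private check, showing the method really is there
# on an unrelated object such as a String.
reachable_from_string = "any string".send(:foo, "bar")
```

This matches the private keyword in the imagined class body above: the methods land on Object, but they can only be called without an explicit receiver.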

0 comments | Tags: irb ruby

That First Line of Ruby

Wed Apr 09 15:37:00 UTC 2008

Probably the very first piece of ruby code most of us wrote was some invocation of the puts method. In those early days, I remember grokking enough to know that puts added a newline and print did not and then I moved on.

Recently, I was watching a coworker give a demonstration to our team of the facade pattern using ruby. His example worked great, except that the output wasn’t formatted the way he expected. He had overridden the to_s method in one of his classes and was using Kernel#p to output the object.

class Grouper
  attr_accessor :taste, :color, :price
  def initialize(taste, color, price)
    @taste, @color, @price = taste, color, price
  end
  def to_s
    "Taste: #{taste}, Color: #{color}, Price: #{price}"
  end
end

p Grouper.new("good", "white", "cheap")
which yielded
#<Grouper:0x314494 @taste="good", @price="cheap", @color="white">

He was expecting his overridden to_s method to have been invoked, and it wasn’t.

I decided it might be a good idea to go back and dig a little deeper into some of the ways ruby provides for getting output to $stdout.

To start, let’s take every newbie’s favorite method, puts. So hearkening back to those heady early moments with Ruby, I fired up irb and invoked puts:


>> puts "Oi" 
Oi
=> nil

What just happened? Well, we know that the puts message must have a receiver, even though it’s not apparent from this invocation through irb. To track down what receiver I was sending the puts message to, I started with ruby info:

ri puts

which didn’t get me very far:

IO#puts, IRB::Locale#puts, IRB::Notifier::AbstructNotifier#puts,
IRB::OutputMethod#puts, Kernel#puts, Net::Telnet#puts,
Net::WriteAdapter#puts,
Net::SSH::Service::Process::OpenManager#puts,
Net::SSH::Service::Process::POpen3Manager::SSHStdinPipe#puts,
StringIO#puts, XMP#puts, XMP::StringInputMethod#puts,
Zlib::GzipWriter#puts, TMail::Decoder#puts, TMail::Decoder#puts,
TMail::Encoder#puts, TMail::Encoder#puts

So which one of these puts was being invoked? If you took the time to read the list, there’s really only one likely candidate, but still, let’s try a more direct approach. Back in irb


>> self.class
=> Object

The “hidden” receiver of the puts message is the irb top-level object, an instance of Object. If we check Object’s instance methods


>> Object.instance_methods.include?("puts")
=> false

we see that puts isn’t one of them. Let’s check its privates


>> Object.private_methods.include?("puts")
=> true

and there it is. But Object#puts wasn’t one of the results of our ruby info request. To be precise, puts is actually a private instance method declared by the Kernel module. Remember that the Object class doesn’t define any instance methods; it gets them all from mixing in the Kernel module. This misled me for some time because the docs show lots of methods for Object. Then I bothered to read the class documentation for Object, which points out

    
"Although the instance methods of +Object+ are defined by the +Kernel+ module, we have chosen to document them here for clarity." 

Had the opposite effect on me, but oh well. Nonetheless, now we know that the method we’ve been calling from irb is Kernel#puts. Back to ri

------------------------------------------------------------ Kernel#puts
     puts(obj, ...)    => nil
------------------------------------------------------------------------
     Equivalent to

         $stdout.puts(obj, ...)
$stdout is a pre-defined variable referencing an instance of the class IO. So at long last, with one final stroke of ri, we find the method we’ve been invoking all this time
---------------------------------------------------------------- IO#puts
     ios.puts(obj, ...)    => nil
------------------------------------------------------------------------
     Writes the given objects to _ios_ as with +IO#print+. Writes a
     record separator (typically a newline) after any that do not
     already end with a newline sequence. If called with an array
     argument, writes each element on a new line. If called without
     arguments, outputs a single record separator.

And all it does is write out what you pass it, adding the slightest bit of formatting in the form of new lines after each object. (The record separator used is a bit of a mystery.) What if you pass it something other than a string? Then that object’s to_s method is used to get the string representation of the object.

Let’s move on to another popular method, print. As I said at the beginning, I used to just think of print as being like puts but without the new line. That is correct, but is not the only difference. When print is invoked without an explicit receiver in irb, the call goes to Kernel#print, which hands off to $stdout, an instance of IO, just like puts. In the case of print, however, there is some extra logic that gets applied to the formatting. print will insert the value of the output field separator variable ($,) between fields, and append the value of the output record separator variable ($\) to the end of the output.

So typically those variables are nil

>> print "one", "two", "three" 
onetwothree=> nil

Nothing between the fields, and nothing at the end of the output. If you give those variables values


>> $\ = " [louise] " 
>> $, = " [mabel] " 
>> print "one", "two", "three" 
one [mabel] two [mabel] three [louise] => nil

To finish, we’ll look at Object#inspect and Kernel#p. The key difference between Object#inspect and the other methods we’re looking at is that Object#inspect does not write anything to $stdout and returns a String. The other methods all write to $stdout and return nil.

The common use case for Object#inspect is seeing what the current state of the object is, i.e. what all its instance variables are set to. By default, inspect returns a string comprising the object’s class and id, and the state of its variables.


>> g = Grouper.new("good", "white", "cheap")
>> g.inspect
=> "#<Grouper:0x3503f4 @color=\"white\", @taste=\"good\", @price=\"cheap\">" 

Object#inspect does some simple formatting, like surrounding strings with quotes. One thing to note is that Object#inspect does not bother itself with Grouper’s to_s method.

Kernel#p takes one or more objects as arguments and just iterates through them, calling Object#inspect for each, and sending the output to $stdout.

>> p({"one" => 1, "two" => 2, "three" => 3}, "Frank", ["green", "blue"], 5, Object.new)
{"three"=>3, "two"=>2, "one"=>1}
"Frank" 
["green", "blue"]
5
#<Object:0x81790>
=> nil

So back to my coworker’s example with the Grouper class. He had used the Kernel#p method to print out a string representation of his object. Kernel#p just sends the inspect message to the Grouper instance and writes the returned string out to $stdout. At no point does the to_s message get sent to it. This, however, would have given him what he wanted

>> puts Grouper.new("good", "white", "cheap")
Taste: good, Color: white, Price: cheap
=> nil
Here’s a summary of the four methods we looked at
                       writes to $stdout | invokes obj.to_s | returns
IO#puts(obj, ...)             YES        |        YES       |   nil
IO#print(obj, ...)            YES        |        YES       |   nil
Object#inspect                NO         |        NO        |   String
Kernel#p(obj, ...)            YES        |        NO        |   nil

If these don’t give you the control or formatting you need, then take a look at the PrettyPrint class and its Modules.
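
To check the table without squinting at a terminal, the output can be captured and compared. The sketch below reuses the Grouper class from earlier; capture_stdout is a helper invented here that temporarily points $stdout at a StringIO, and it assumes $, and $\ are left at their default of nil.

```ruby
require "stringio"

class Grouper
  attr_accessor :taste, :color, :price

  def initialize(taste, color, price)
    @taste, @color, @price = taste, color, price
  end

  def to_s
    "Taste: #{taste}, Color: #{color}, Price: #{price}"
  end
end

# Run the block with $stdout pointed at a StringIO and
# return whatever was written to it.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

fish = Grouper.new("good", "white", "cheap")

from_puts  = capture_stdout { puts fish }   # goes through to_s, adds a newline
from_print = capture_stdout { print fish }  # goes through to_s, no newline
from_p     = capture_stdout { p fish }      # goes through inspect, adds a newline
```

Comparing from_puts with fish.to_s and from_p with fish.inspect shows the split in the table directly.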

0 comments | Tags: inspect p print puts ruby