Ruby Questions: May 2012

Thursday, May 31, 2012

What are ruby modules and how do they work?

A ruby module is basically a container for constants and methods. It is sort of a mix between structures in languages like C and C++ and interfaces in Java (because it can be mixed in to simulate multiple inheritance - more on that later). Using a ruby module works very well for resolving namespace conflicts and promoting a well abstracted program design.

To define a ruby module, you type module MyModule. Module names are constants, so they must begin with an uppercase letter. Just as with a class, you end the module definition with the end keyword. Inside a module, you can define constants (which must also start with a capital letter) and functions with the def statement. An important note: you can define constants in MyModule by just writing CONSTANT = 5 or the like, but for a method to be callable directly on the module you must prefix its name with the module name (or self) and a dot, as in def MyModule.function. Methods defined without the prefix only become available once the module is mixed in to a class.

When you have defined a module, you can access its constants with the :: operator as in MyModule::CONSTANT and its functions with MyModule.function (the . operator). Here's a full example (albeit with stupid outputs):

module OneThing
  VALUE = 24
  def OneThing.function
    puts 'this is the first thing'
  end
end

module TwoThing
  VALUE = 25
  def TwoThing.function
    puts 'this is the second thing'
  end
end

puts OneThing::VALUE.to_s
OneThing.function
puts TwoThing::VALUE.to_s
TwoThing.function

This will output:

24
this is the first thing
25
this is the second thing

So that's how modules work in ruby. More on mixing in modules later when we talk about inheritance and multiple inheritance.

Sunday, May 27, 2012

How do you work with public and private in ruby?

In ruby, there are three levels of privacy that a class's methods can have. Public methods can be called on any object of the class from anywhere; protected methods can be called with an explicit receiver only from inside the methods of the class or of classes that extend it; private methods can only be called on the implicit object self, without an explicit receiver, which means they are usable inside the class that defines them and inside its subclasses.

The simplest way to set the level of privacy for a method is to put public, protected, or private on a line before its definition, as such:

private

def private_method

puts 'a public method called this private method'

end

Note that you do NOT need to specify an end statement to correspond to the private statement: any method defined after the private statement will be private unless otherwise specified with another permissions modifier statement.

You can also pass symbol arguments to the private, public and protected functions to make the corresponding methods private, public or protected. For example,

def private_method

puts 'a public method called this private method'

end

private :private_method

would produce the same result as the previous example. For me, it's much easier to call the private function with a symbol after the function definition. That way a function definition is sort of in two parts: the actual logic and the permission setting. Here's a test file that demonstrates private, protected, and public methods:

class PrivateTest
  def initialize
    @size = 5
  end

  def this_is_private x
    'this should be private ' + x
  end
  private :this_is_private

  # public wrapper around the private method
  def this_is_public x
    this_is_private x
  end

  def private_again
    puts 'hello, a private method was called'
  end
  private :private_again

  def protected_method
    puts 'mustve been called from a class or subclass'
  end
  protected :protected_method

  # public wrapper around the protected method
  def call_protected
    protected_method
  end

  def is_it_public?
    puts 'yep its public'
  end
end

class ProtectedTest < PrivateTest
end

puts 'commencing tests'
pvt = PrivateTest.new
ptc = ProtectedTest.new

# pvt.this_is_private
pvt.this_is_public 'hello' #works
# pvt.private_again
# pvt.protected_method
pvt.call_protected
pvt.is_it_public?

# ptc.this_is_private
ptc.this_is_public 'hello'
# ptc.private_again
# ptc.protected_method
ptc.call_protected
ptc.is_it_public?

puts 'all passed.'

The commented out tests each raise a NoMethodError if you uncomment them, but the uncommented tests run without error. As far as I can tell, there is no way to have protected static variables.
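To make the difference between protected and private a bit more concrete, here is a minimal sketch. The Account class and its method names are made up for illustration; the point is only that a protected method accepts an explicit receiver of the same class, while a private one is reached through the implicit self.

```ruby
# Hypothetical Account class, used only to illustrate the three visibility levels.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Public: may be called on any Account from anywhere.
  def richer_than?(other)
    # An explicit receiver (other.balance) is allowed here because
    # balance is protected and other is also an Account.
    balance > other.balance
  end

  def summary
    # Private methods are called with the implicit receiver self.
    "balance: #{formatted_balance}"
  end

  protected

  def balance
    @balance
  end

  private

  def formatted_balance
    "#{@balance} dollars"
  end
end

a = Account.new(10)
b = Account.new(5)

comparison = a.richer_than?(b)
text = a.summary

# Calling a protected or private method from outside raises NoMethodError.
outside_errors = [:balance, :formatted_balance].map do |name|
  begin
    a.public_send(name)
    :no_error
  rescue NoMethodError
    :no_method_error
  end
end

puts comparison
puts text
p outside_errors
```

Both outside calls are rejected, while the public methods can still use the protected and private ones internally.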

So that's how public and private work in ruby.

Friday, May 25, 2012

How do you define classes in ruby?

To define a class in ruby, you simply have to type class MyClass. Class names are constants, so ruby enforces the convention of starting them with a capital letter. The class definition is terminated by an end statement.

The constructor of every class in ruby is called initialize. You can define a constructor by simply defining a function with that name, and any arguments you pass to MyClass.new will be passed to initialize. You do not have to specify a constructor if you don't want to; the class then inherits one that takes no parameters and does nothing. In this case, every field will initially be nil if you try to access it.

Object fields in ruby are denoted by the enforced convention of the @ sign. For example, saying

class DListNode
def initialize
@item = 5
end
end

will, on the creation of a new DListNode object, set the object's item field to 5. You can define methods just as you defined the constructor, although obviously methods can have any name you like. By convention they start with a lowercase letter, and they can't start with @.

Ruby has a very cool convention for writing functions to access object fields. There are three built in functions called attr_accessor, attr_reader and attr_writer, which each take in a variable number of symbols and generate the corresponding access methods. For example, if you wanted to have the size of a DList (doubly linked list data structure) be accessible to any program that wanted to read it, but keep programs from changing it, you would say attr_reader :size, conventionally before you define initialize. This means that some other program would be able to say my_dlist.size and it would return the correct value, but it would not be able to say my_dlist.size = 6.

If you wanted to make the size field totally public, you could say attr_accessor :size instead, which would allow other programs to change my_dlist.size. If for some reason (I can't think of any at the moment) you only wanted other programs to have write access, you could say attr_writer :size.

Note that this means, if you specified attr_reader for a field, you do not have to refer to it with the @ prefix when reading it later on. However, if you did not specify read access, you still need to say @size, instead of just size. This is because a bare size inside the class is a call to the generated reader method, which returns @size. Assignments inside the class should still use @size, since size = 6 on its own would just create a local variable.
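Here is a small sketch of a read-only field. The Counter class is a made-up example, not part of the list implementation; it just shows that the reader exists and the writer does not.

```ruby
# Hypothetical Counter class; the names are made up for illustration.
class Counter
  attr_reader :size          # generates a size method, but no size= method

  def initialize
    @size = 0
  end

  def increment
    @size += 1               # writes go through @size
    size                     # a bare size calls the generated reader
  end
end

counter = Counter.new
counter.increment
current = counter.size

begin
  counter.size = 6           # no writer was generated, so this fails
  write_result = :allowed
rescue NoMethodError
  write_result = :rejected
end

puts current
puts write_result
```

Swapping attr_reader for attr_accessor in this sketch would make the assignment succeed.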

Here's a full (though not fully well encapsulated) implementation of a doubly linked list in ruby for you to look over.

class DListNode
attr_accessor :item, :nextone, :prevone
def initialize(item=nil, nextone=nil, prevone=nil)
@item = item
@nextone = nextone
@prevone = prevone
end
def to_s
if nextone.item == nil
return item.to_s unless item == nil
return 'nil'
end
return item.to_s + ', ' + nextone.to_s unless item == nil
return 'nil, ' + nextone.to_s
end
end

class DList
attr_reader :size
def initialize
@size = 0
@head = DListNode.new
@head.nextone = @head
@head.prevone = @head
end
def add item
node = DListNode.new item
node.prevone = @head.prevone
node.nextone = @head
node.prevone.nextone = node
node.nextone.prevone = node
@size += 1
end
def to_s
'['+@head.nextone.to_s+']'
end
end

So that's how to define classes in ruby.

Wednesday, May 23, 2012

How do you throw and catch exceptions in ruby?

To throw an exception (or Error) in ruby, you simply have to type something along the lines of raise 'you did something wrong'. This raises a RuntimeError. If you would like to raise some other error, such as a LocalJumpError or, for example, some other error object that you defined, you can instead say raise LocalJumpError, "you jumped around and you shouldn't have!" (note the comma between the class and the message). You can raise an error without a message as in simply raise MyError, or you can even raise a RuntimeError with a default message by just typing raise (inside a rescue clause, a bare raise re-raises the error being handled instead).

To catch an error (to do something when an error happens), you put the code that might fail in a begin...end block and say rescue. rescue with no arguments will catch any StandardError (which covers most everyday errors, including RuntimeError), rescue RuntimeError will catch only RuntimeErrors, and rescue RuntimeError => e will catch only RuntimeErrors and provide you with a reference to the RuntimeError object which was raised, and which you can call methods on as specified in the ruby documentation. For example, e.message would return the string that was passed along with the raise clause.

The analogue of a finally clause in java or python is the ensure statement in ruby. Anything positioned after this statement but before the end of the begin...end block will always be executed, whether or not an error was raised. Inside a method, a return within the begin...end block works much as it does in java and python, and the ensure code still runs before the method returns. In ruby 1.9, a return outside of any method, for example in a top-level begin...end block, raises a confusing LocalJumpError instead.

Here's a full example:

def raise_if_string x
  # double quotes, because the message itself contains an apostrophe
  raise RuntimeError, "you tried to pass a string, didn't you?" if x.is_a?(String)
end

begin
  raise_if_string 5 # will do nothing
  raise_if_string 'hello' # will raise a RuntimeError
rescue RuntimeError => e
  puts e.message
ensure
  puts 'this will always be printed. goodbye'
end

The output of this would be:

you tried to pass a string, didn't you?
this will always be printed. goodbye
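As a further sketch, here is what defining your own error class and returning from inside a rescue clause might look like. MyError and risky are hypothetical names; the thing to notice is that the ensure code still runs even though the method returns early.

```ruby
# MyError and risky are hypothetical names for this sketch.
class MyError < StandardError; end

def risky(log)
  begin
    raise MyError, 'something went wrong'
  rescue MyError => e
    log << "rescued: #{e.message}"
    return :recovered            # ensure below still runs before returning
  ensure
    log << 'cleanup'
  end
end

log = []
result = risky(log)
p result
p log
```

Subclassing StandardError (rather than Exception) means a plain rescue with no arguments would also catch MyError.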

So that's how you throw and catch exceptions in ruby.

Tuesday, May 22, 2012

What does the & mean in ruby?

The unary ampersand operator converts a block to a proc and a proc to a block. There are some methods, such as Array#map, which expect to be passed a block. You can pass these methods a proc instead with the & operator.

For example, say you have square = lambda {|x| x*x}. You can then write [1,2,3].map(&square), and that will return an array with elements 1, 4, and 9.
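Written out as a runnable sketch, with the literal block version next to it for comparison (the variable names are just for illustration):

```ruby
square = lambda { |x| x * x }

# &square turns the lambda into the block that map expects.
with_proc = [1, 2, 3].map(&square)

# The same call written with a literal block.
with_block = [1, 2, 3].map { |x| x * x }

p with_proc
p with_block
```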

Conversely, you can use the & operator to pass a block where a proc would usually be required. Consider this function:

def apply_proc arg, &process
process.call(arg)
end

You can then say apply_proc('hello') {|x| puts x}, and it will work as expected: it will print hello. Important: parentheses are REQUIRED here when the block uses curly braces. If you were just to say apply_proc 'hello' {|x| puts x} or apply_proc 'hello', {|x| puts x}, it would SyntaxError, because curly braces bind to the nearest thing on their left, so ruby tries to attach the block to 'hello' (or to read it as a hash) rather than to apply_proc. A do...end block binds more loosely and works without parentheses. Also, you can't have a function definition like def apply_proc &process, arg, because the block parameter must come last in the parameter list. This also means, of course, that you can only pass one block to a function. If only blocks were first class objects.

One more thing, on how things are actually converted in ruby. When ruby sees an & sign in an argument list, it checks to see if the operand of the & is a proc, and if it is not, it converts it to a proc using the .to_proc method before passing it along as the block. In ruby 1.9, the & operator works with symbols - &:capitalize gives you the string method capitalize. You HAVE to be careful of this, though. :capitalize.to_proc.call('hello') functions exactly the same as 'hello'.capitalize - the symbol to_proc method returns a proc that accepts an argument and calls the method named by the symbol on that argument. AND, as if you weren't already confused enough, you CAN'T say &:capitalize.call('hello'): the ampersand notation is only valid when passing an argument to a function, as in apply_proc('hello', &:capitalize). This is incredibly confusing and absurd to me, because the symbol is not a block, as it would seem that it should be. But at any rate, ruby is confusing.
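A short sketch of the symbol case, assuming ruby 1.9 or later. The three forms below should be equivalent ways of calling capitalize:

```ruby
words = ['hello', 'world']

# &:capitalize calls :capitalize.to_proc and passes the result as the block.
short = words.map(&:capitalize)

# Equivalent long form with a literal block.
long = words.map { |word| word.capitalize }

# The proc can also be built and called by hand.
by_hand = :capitalize.to_proc.call('hello')

p short
p long
p by_hand
```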

So that's what the & means in ruby.

Monday, May 21, 2012

What does the ruby p function do?

The p function takes a variable number of arguments and prints how they would be displayed in the ruby interpreter. This is the same as the string that is returned by the obj.inspect method.
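A quick sketch of how p compares with puts; the variables are only there for illustration:

```ruby
a = [1, 2, 3, 4]
s = 'hello'

puts s          # prints the bare text of the string
p s             # prints the inspect form, with quotation marks
puts a.inspect  # prints the same text that p a would print

# inspect returns a string, so it can be compared directly.
same = (a.inspect == '[1, 2, 3, 4]') && (s.inspect == '"hello"')
p same
```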

Example: you have an array a = [1,2,3,4] and a string s = 'hello'. a.inspect returns the string [1, 2, 3, 4], and s.inspect returns the string "hello" (including the quotation marks). So p a,s would output:

[1, 2, 3, 4]
"hello"

That's what the ruby p function does.

Sunday, May 20, 2012

What are here docs and how are they implemented in ruby?

A here document is a way to represent a multiline string. It's not a concept that's unique to ruby - it's also used in shell scripts, php, perl, and some other scripting languages. In ruby, regular strings with double or single quotes can span multiple lines, so there is almost never a need for here documents.

Of course, in true ruby style, there is an implementation available. To define a here doc, you type <<ID, where ID is some unique word, like EOF or MY_COOL_LIST. The convention, as I understand it, is to type this identifier in all caps. Then, use as many lines as you want to define the actual string. Ruby will stop defining the string when it encounters the identifier again, on its own line. So for example:

puts <<HEREDOC
Hello there!
This is a heredoc, which means
that if you print it, line breaks and spacing will be preserved.
HEREDOC

Will print:

Hello there!
This is a heredoc, which means
that if you print it, line breaks and spacing will be preserved.

A couple things to note: the text does not have to be specifically enclosed in the <<ID and ID; instead, saying <<ID specifies that a heredoc will be defined on the following lines. Also, if there are two heredocs defined on one line, e.g. as in func(<<DOC1, <<DOC2), then the second doc will begin to be defined on the line after the closing DOC1. Then you can treat the <<DOC1 declaration as an object, as in the above example. Example: you can do this (copied from Nicholas Evans on Jay Fields' blog):

array_of_long_pasted_in_strings = [<<FOO, <<BAR, <<BLATZ]
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
FOO
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
BAR
I recently discovered that I can use multiple heredocs as parameters.
Isn't that neat? Because almost nothing needs to be escaped with heredocs, I prefer to use it when strings are pasted in from elsewhere. Because the syntax for using it inside parameter lists is so nice, I prefer to use it whenever a multiline string literal is being passed into something as an argument.
BLATZ

Another quick note: if you use <<ID to define heredocs, then the closing ID must be the first thing on its line, or it will be included in the multiline string. If you put in a dash, like <<-ID, the string definition will end whether there are spaces before the ending ID or not. Refer to the blog post linked above for a good example of this.
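A minimal sketch of the dash form, using a made-up method and identifier. Note that the dash only affects the closing identifier; leading spaces inside the text itself are kept.

```ruby
# indented_text and END_TEXT are hypothetical names for this sketch.
def indented_text
  # The dash lets the closing END_TEXT be indented to match the code.
  <<-END_TEXT
first line
  second line keeps its leading spaces
  END_TEXT
end

result = indented_text
puts result
```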

So why would you want to use heredocs instead of just regular multiline strings? The only example I can think of is if you have a function with a lot of arguments that you need to pass a multiline string to. For example: open_address(:write, <<ADDRESS, 10, ' '). Then you could define the multiline string below the function call instead of awkwardly making the function call span multiple lines.

So, that's what heredocs are, and that's how they're implemented in ruby.

Saturday, May 19, 2012

What is the equivalent of a lambda function in ruby and how is it used?

In python, you can define a lambda function with funct = lambda x: x*x, and from then on funct will be a callable function that takes one argument and returns the square of the argument. In ruby, the equivalent system is called a Proc (procedure) object.

A proc is basically just an object that takes a block at its instantiation. For example, you could say funct = Proc.new {|x| x*x}, and then funct would be a proc object equivalent to the python example above. Proc objects CANNOT be called like regular functions, however: you can't say puts funct(4), or anything like that. You CAN, however, invoke the block of the proc object with funct[4] (straight brackets) or by calling the call method: funct.call(6).

There are a couple of other notations for instantiating proc objects. If you say proc {block}, it's the exact same thing as saying Proc.new {block}. You can also say lambda {block}, which will give you almost the same proc object.

Interestingly, procs created with lambda and procs created with proc (or Proc.new) are slightly different. Procs created with proc, if passed more arguments than are accounted for in their blocks, will simply drop the extra arguments and proceed like nothing happened. Lambda defined procs will, in the same situation, raise an ArgumentError. Example: funct[2,3,4,5] will return 4 if it was instantiated with proc {|x| x*x}, but it will ArgumentError if it was instantiated with lambda {|x| x*x}.

I am really not sure why this distinction was even included in ruby at all; perhaps it was a backwards compatibility issue, but it seems to me that any function, whether represented by an actual function or a proc object, should throw an error when given the incorrect number of arguments. Just seems like it would make bugs a lot easier to find that way. You can check whether a proc was generated with a lambda expression by calling its lambda? method, as in funct.lambda?.
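The argument handling difference can be seen side by side in a small sketch (variable names are just for illustration):

```ruby
square_proc = proc { |x| x * x }
square_lambda = lambda { |x| x * x }

# A plain proc silently drops the extra arguments.
from_proc = square_proc[2, 3, 4, 5]

# A lambda checks the argument count and raises.
begin
  square_lambda[2, 3, 4, 5]
  from_lambda = :no_error
rescue ArgumentError
  from_lambda = :argument_error
end

p from_proc
p from_lambda
p square_proc.lambda?
p square_lambda.lambda?
```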

Procs also have some other cool methods, specifically for hashing and finding the number of arguments and currying. I would again direct you to the ruby doc page for more info. As for the useful part...they're as useful as any other function, and maybe even more since they can be passed around. Entirely first class!

So that's what the equivalent of a lambda function is in ruby, and that's how it's useful.

Friday, May 18, 2012

What are blocks in ruby and how are they useful?

Blocks in ruby are an incredible thing, and they are unlike anything in any other programming language, at least that I've encountered. A block is delimited by curly braces or a do...end section.

For example, arrays in ruby have a method called each (called without a block it returns an Enumerator, but think of it as basically a method that takes a block for now). You can say:

a = [1,2,3]
a.each {|i| print i}

and ruby will output something like 123. The part inside the curly braces is a block. You can also have a block like this:

a.each do |i|
print i
end

and it's the same thing. A block is basically a group of statements.

Each is basically a function that can take a block. To make your own function that can take blocks, you use the yield statement. For example, say you have a function like this (copied from ruby docs page):

def three_times
yield
yield
yield
end

This yield statement is different in ruby than it is in, say, python. Any function that reaches a yield statement must have been given a block when it was called; otherwise ruby raises a LocalJumpError (you can check with block_given? first). You pass it a block by simply writing the block directly after the function call. Every time the execution in the function reaches a yield statement, the block is evaluated. So, writing three_times {print 'hello '} would output hello hello hello.
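If a function should work both with and without a block, one option is to guard the yield with block_given?. A minimal sketch, with greet as a made-up method name:

```ruby
# greet is a hypothetical method for this sketch.
def greet
  if block_given?
    yield                     # evaluates to the block's value
  else
    'no block was passed'
  end
end

with_block = greet { 'hello from the block' }
without_block = greet

puts with_block
puts without_block
```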

You can also pass values to the yield statement, like this:

def one_ten
for i in 1..10
yield i
end
end

To access the value passed to yield, you write a block with a variable name in vertical lines at the front, like this: one_ten {|i| print i} or:

one_ten do |i|
print i
end

Both of those will print 12345678910 to the output.

Here, we're really talking about iterators. Iterators are functions that have the yield statement in them, i.e. they can hand one or more values to a block, depending on how their yield statements are organized. In ruby, it seems that iterators are linked intrinsically to blocks.

One more thing. In python, for example, yield statements are sort of like return statements, but in ruby, yield actually evaluates to the value of the last statement processed in the block. (Notice that technically, yield is not a statement, it's an expression with a value.) So you can have something like this, for example:

def counting_up
a = 0
while a <= 10
a = yield a
end
end

counting_up do |yielded|
print yielded
yielded + 1
end

That will output 012345678910, because a will be set to whatever was yielded (a) plus 1 each time. Careful - if the yielded + 1 had not been there, the block would have returned the value of print, which is nil, and the next a <= 10 comparison would have raised a NoMethodError.

Blocks are useful in ruby in a couple of ways. One is the Array each method. There are other iterators as well built into ruby in different places: the Array#collect method, which takes a block and composes an array with the result of the block applied to each item of the original array, is one. So something like [1,2,3].collect{|i| i+1} would return [2, 3, 4]. Again, look at the doc page for more examples.
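To see how something like collect could be built from yield, here is a rough sketch. my_collect is a made-up name and surely not how ruby implements the real method; it only illustrates the idea of collecting the block's return values.

```ruby
# my_collect is a hypothetical helper; Array#collect is the real built-in.
def my_collect(array)
  result = []
  # yield(item) runs the caller's block and gives back its value.
  array.each { |item| result << yield(item) }
  result
end

built_in = [1, 2, 3].collect { |i| i + 1 }
hand_made = my_collect([1, 2, 3]) { |i| i + 1 }

p built_in
p hand_made
```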

So that's what blocks are in ruby, and how they're useful.

Thursday, May 17, 2012

What are hashes in ruby, and how are they different from regular hash tables?

A hash in ruby is basically a regular hash table with some cool added features. That is, it's a data structure that maps keys of any type (string, integer, mostly symbol, etc) to values (of not necessarily the same type).

To define a hash in ruby, you type something like this: h = {:username => 'john302', :password => 'coolstuff', :age => 5}. Then to access the value stored with key :username, you would say h[:username], which in this example would evaluate to the string john302.

To define an empty hash, you can say h = {} or h = Hash.new (or h = Hash.new()).

The reason they're different from regular hash tables is you can define cool things like default values. For example, say you want a hash where an integer maps to its cube. You can type h = Hash.new {|hash, key| hash[key] = key * key * key} (the block goes after Hash.new, not inside parentheses). This sets the default value of, say, h[5] to 125. The block is not run ahead of time; ruby calls it the first time you look up a key that isn't in the hash. Anyway, you can still store something in the hash, like, say, h[5] = 'cat', but if nothing is stored in a slot in the hash, it will default to its default value.

The time it takes to do this depends on the function, though. For example, if you have a function that adds a key to itself a hundred times, setting that function as the default for a hash will take a tiny amount of time, because nothing is computed yet. But looking up a missing key will actually do the work (add the key to itself a hundred times), so it may be slower. Because the block above stores its result with hash[key] = ..., later lookups of the same key are ordinary hash lookups. So hashes in ruby are not magic.

The really interesting thing is that this doesn't only work with integers. For example, if you say h = Hash.new {|hash,key| hash[key] = key + key}, then h['cat'] will default to 'catcat'. You have to be careful though, because not every operation is defined for every type. e.g., you shouldn't say h = Hash.new {|hash,key| hash[key] = key / key}, because then trying to say h[5] will evaluate to 1, but h['cat'] will raise a NoMethodError.
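Here is a small sketch of the default block in action. The calls counter is only there to show when the block actually runs; it is not something Hash needs.

```ruby
calls = 0
cubes = Hash.new do |hash, key|
  calls += 1                    # count how often the default block runs
  hash[key] = key * key * key   # store the result so it is computed once
end

first = cubes[5]     # key is missing, so the block runs and stores the cube
second = cubes[5]    # plain lookup, the block does not run again
cubes[2] = 'cat'     # explicitly stored values are returned as usual

p first
p second
p cubes[2]
p calls
```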

You can find out the other cool stuff about hashes at the documentation page on them.

So, that's what hashes are in ruby, and how they're different from regular hash tables.

What are arrays in ruby and how do they work?

Arrays are a built in data type in ruby, basically. You can define an array by typing something like a = [1,2,3], then you can get an element of the array with a[0], for instance. It's interesting to note that trying to say a[9] will NOT raise any sort of error, instead it will return a nil.

Just like in python, or any other mostly high-level language, arrays in ruby have a lot of built in methods. You can say [1,2,3] + [2,3,4], which will evaluate to [1,2,3,2,3,4], or you can do tricky things like intersection ([1,2,3] & [2,3,4] returns [2,3]) and union ([1,2,3] | [2,3,4] returns [1,2,3,4]). Of course, array elements don't have to be integers, they can be whatever your heart desires.
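The operations above, collected into one runnable sketch (the variable names are just for illustration):

```ruby
a = [1, 2, 3]
b = [2, 3, 4]

concatenated = a + b   # keeps duplicates
intersection = a & b   # elements present in both
union = a | b          # all elements, duplicates removed
missing = a[9]         # out-of-range index gives nil, not an error
mixed = [1, 'two', :three, [4.0]]   # elements can be of any type

p concatenated
p intersection
p union
p missing
p mixed.length
```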

There are as many or more built in methods for arrays as there are in python, and you can find them all at the documentation page for ruby arrays.

As for how they work, they are like arrays in any other language. They're an ordered sequence of objects, and the point is that you can access each one in constant time by providing its index. This is how they work.

So that's what arrays are, in ruby (and how they work). Or at least it's a start.

Tuesday, May 15, 2012

What is a ruby symbol?

Basically, a ruby symbol is an identifier that can't be changed. It's supposed to be, from my understanding, sort of a placeholder.

For example, say you have a bunch of hashes (dictionaries/hashtables). And they all have the same keys, but different values. Storing each key as a string would take up a ton of memory. Instead, you can store it as a symbol. With a symbol, each instance points to the same thing in memory.

To define a symbol, for example to store the name of something in a hash table, you type :name. You CANNOT say :name = 'john', because you can't "set" symbols to other things. Instead, you can define a hash or something like this: h = {:name => 'john', :occupation => 'carpenter'}.

Here, you are defining a representation of john the carpenter. You could have used strings as the keys in that hash table, but that's not really what strings are meant to be used for in ruby (as opposed to in python, where strings are immutable). You use the symbol as a placeholder, because it's easier and more conventional. Also, if you have 1,000 hashes to represent 1,000 people, using symbols will save you a pretty good chunk of memory.

You can get a string representation of the symbol if you need to with :name.to_s, but that doesn't make a lot of sense with a symbol. Basically, it seems to me that ruby symbols are a convenient way to enforce, or at least push people towards, a better convention for defining keys. Really, a symbol is a key, that you use to access something else.
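One way to see the "each instance points to the same thing" idea is to compare object identity with equal?. This is only a sketch; the strings are built with String.new so that they are definitely separate objects.

```ruby
first = String.new('name')
second = String.new('name')

same_text = (first == second)                    # same characters
same_string_object = first.equal?(second)        # but two separate objects

same_symbol_object = :name.equal?(:name)         # one shared object
converted = first.to_sym.equal?(second.to_sym)   # both convert to that same symbol

p same_text
p same_string_object
p same_symbol_object
p converted
```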

That's what a ruby symbol is.

How do you declare variables and functions in ruby?

As it turns out, declaring variables and functions in ruby is fairly simple. To declare a variable, simply take a name, like string for instance, and assign it to a value with an = sign. For example, string = 'hello' would bind the name string to the string literal hello. Then, for example, puts(string) would print hello.

Functions are a little bit more complicated. To define a function, for instance to return the square of a number, type def square(x). Ruby will automatically return the last thing evaluated in the function body, so typing x*x will suffice for the next line. Ruby function definitions end with the end keyword. So the total function definition is:

def square(x)

x*x

end

You can then call this function with square(5) or square 5. The parentheses are optional as long as the call is unambiguous. You can also use the return statement, like in other languages.

That's how you define functions and variables in ruby.

Sunday, May 13, 2012

How do you download/install/run ruby?

Ruby can be downloaded using your favorite package manager. On debian based linux distributions with apt-get, you can simply type sudo apt-get install ruby1.9.1 to get version 1.9.2 of ruby (the naming convention has something to do with library dependencies, which I don't quite understand, but 1.9.2 is the version that gets installed). You can check what version you have with ruby --version.

On Mac OSX with homebrew, it's just brew install ruby. Easy!

If you're on windows, you can download RubyInstaller.

You can also compile ruby from source, using the instructions on the download page. Ruby is written in C, and it's open source, so you can view the entire source code on github.

You can run ruby using the ruby command. Ruby source files have the extension .rb, so say you have a file that you've written called helloworld.rb; you can run this using ruby helloworld.rb.

You can also start the ruby interpreter using the irb command. After that, you can type any ruby command into the interpreter and it will evaluate it for you in real time, or you can type exit to close the interpreter.

That is how you download, install, and run ruby.