Monday, May 21, 2007

Test Mail Server

It's delivery time: you need to test your application and make sure it sends an email notification when the form is submitted. You look around wondering where the system admin went, and after a while, when you catch him sitting in the cafeteria and ask him for a mail server to test your logic, he stares at you for a moment before replying, "We got no mail server for testing." Well, at this point the frustration reaches its peak, and you start wondering why in the world we hire those sysadmins.

Well, you won't need them anymore. Just use Gmail; yes, you can use your Gmail account (surely you have one). Here is the configuration:

# in config/environment.rb -- Rails 1.x calls this server_settings; newer versions use smtp_settings
ActionMailer::Base.server_settings = {
  :address        => "smtp.gmail.com",
  :port           => 587,
  :domain         => "yourdomain.com",
  :authentication => :plain,
  :user_name      => "yourgmailaccount",
  :password       => "yourgmailpassword"
}
# note: Gmail's SMTP requires TLS on port 587, which may need an extra plugin on older Rails
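
For a quick smoke test, a throwaway mailer might look like this (the class, method, and addresses are illustrative):

# app/models/notifier.rb
class Notifier < ActionMailer::Base
  def test_message(recipient)
    recipients recipient
    from       "yourgmailaccount@gmail.com"
    subject    "Test from Rails"
    body       "If you can read this, the Gmail SMTP settings work."
  end
end

Then, from script/console, Notifier.deliver_test_message("you@yourdomain.com") should land in your inbox.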

Friday, May 18, 2007

MogileFS revisited

So I got this reply on my recent post:

"Please recall that MogileFS has no POSIX file API. All file transfers
are done via HTTP. So, it really isn't a drop-in replacement for NFS
or any other network file system. You need to add logic to your
application to deal with MogileFS.

Also, you can't do updates to a file; you must overwrite the entire
file if you make any changes.

MogileFS is primarily intended for a write-once/read-many setup."

So how would this fit into our system? For starters, I think it won't have much impact, since we are storing system images. Updating files won't be an issue, as images tend to be very large, and once stored, an image is either replaced by a newer one or used to restore a system. Also, we are going to use Ruby on Rails to interface with the system imager, our open source imaging system, and Ruby has a plugin for MogileFS, so it won't be a problem to integrate it. Everything seems OK.
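
For a taste of what that integration might look like, here is a minimal sketch with the mogilefs-client Ruby library (the tracker addresses, domain, key, and replication class are illustrative):

require 'mogilefs'

# connect to the trackers (addresses and domain are made up for this example)
mg = MogileFS::MogileFS.new(:domain => 'images',
                            :hosts  => ['10.0.0.1:7001', '10.0.0.2:7001'])

# store a system image under a key; 'normal' is an assumed replication class
mg.store_file('server-42.img', 'normal', '/backups/server-42.img')

# read it back later, e.g. when restoring the server
data = mg.get_file_data('server-42.img')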

What about other systems; how could MogileFS be useful elsewhere? Would these issues be a problem for an application in need of smart storage? Let's take a mail system as an example. We have multiple servers serving a domain, and users' mailboxes are spread among these servers. The files in this case are the emails, and since emails never need updating, the write-once/read-many condition is fulfilled. However, if the mail service is not tailor-made or customized, it will be hard to integrate MogileFS; that is, if you are using a ready-made mail server like Sendmail or qmail, you will find it difficult to make MogileFS your storage engine.

In conclusion, MogileFS is best used with applications developed with MogileFS in mind as their storage engine. You can use it with out-of-the-box systems, but it won't be a smooth ride, and for sure there are some systems that will not benefit from MogileFS, like file sharing or workflow systems.
Still, I can't wait to try it out, and I'll keep you updated.

Thursday, May 17, 2007

Hi, I'm Ruby on Rails - Part 1

What do you get when you cross the Mac vs PC commercials and Rails Envy? Ruby on Rails ads to get everyone hyped for Railsconf, that's what!

Update a newly added column in a migration

One of the features I like most about Rails is migrations: a version control system that keeps track of all database changes. You can easily move your database to any previous version, with its schema and data.
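
For example, stepping the schema back and forth is a single rake invocation (the version number is illustrative):

# migrate up to the latest version
rake db:migrate
# roll back to version 5 by running each later migration's self.down
rake db:migrate VERSION=5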

During my last project, I tried to create a migration that adds a column to a table and then updates that column:
def self.up
  add_column :file_types, :mime_type, :string
  q = FileType.find_by_name('quicktime')
  q.update_attributes :mime_type => 'video/quicktime'
end
If you run that migration, the new column is added successfully but no data is updated. Why is that?!

The problem is that you are trying to update the column, mime_type, immediately after adding it, before the model, FileType, has picked up the new schema (strange, I know, but true).

The solution, as documented, is simple. You just need to call reset_column_information to ensure that the model has the latest column data before the update.

Here is the modified code:
def self.up
  add_column :file_types, :mime_type, :string
  q = FileType.find_by_name('quicktime')
  FileType.reset_column_information
  q.update_attributes :mime_type => 'video/quicktime'
end
And here is the source of reset_column_information:
def reset_column_information
  read_methods.each { |name| undef_method(name) }
  @column_names = @columns = @columns_hash = @content_columns = @dynamic_methods_hash = @read_methods = @inheritance_column = nil
end
It simply resets all the cached information about columns, which will cause them to be reloaded on the next request.
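
For completeness, the matching down migration just drops the column again (a sketch):

def self.down
  remove_column :file_types, :mime_type
end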

Although this problem has a solution, an even worse problem should be mentioned here: in the first case, when you don't call reset_column_information, you don't get any error! The column simply isn't updated. Additionally, if you migrate back to the previous version and then re-run the migration, surprise: you get no errors and the column updates successfully!

I don't know if this is a reported bug, but it is strange behavior. It won't prevent me from developing more and more Rails applications, though.

Wednesday, May 16, 2007

MogileFS Storage engine!

I came across this today, and it seemed interesting. MogileFS is intended for storage-hungry applications; it is all about spreading your files across cheap devices on different hosts, something like RAID + NFS + data replication.

The idea is very nice and simple: you have multiple servers, and every server has multiple devices. You pool all these storage units into one big store, and a tracker application is consulted on every read from or write to this huge store; the tracker takes responsibility for saving your data and making sure it stays available even if multiple hosts go offline.

This application came just in time. We recently had an idea for a project that takes images of your server and stores them on network storage, so if something goes wrong with your server you can simply take the image and restore it, or even restore the image onto a different server to clone it. The challenge was where to store all of these images. By a simple calculation, if you have 100 users and every user has a 10 GB image, then you are bound to maintain a terabyte of storage, and scalability will be an issue.

With MogileFS you gain three advantages here. One, you have cheap disks on cheap servers, with your storage distributed across them. Two, you profit from that distribution by installing the application on all of these servers, gaining high availability. Three, scaling is as simple as adding a server to the farm. So for about half the price of a SAN and its expensive disks, you get high availability for both your storage and your application. Of course, we will have to manage this distributed environment. One way to tackle it is a no-slave architecture: all servers are masters, and every server can find out which server stores a given user's image by consulting the tracker. So when a user logs in, he first reaches any server according to a round-robin algorithm, and from there he is redirected to the server storing his image, where he can be served without the network communication overhead.
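
A rough sketch of that redirect step, again with the mogilefs-client library (the tracker addresses, key scheme, and controller shape are assumptions):

# a Rails controller action (illustrative): send the user to a replica that holds his image
def show
  mg = MogileFS::MogileFS.new(:domain => 'images',
                              :hosts  => ['10.0.0.1:7001', '10.0.0.2:7001'])
  # the tracker returns the HTTP URLs of the replicas holding this key
  paths = mg.get_paths("user-#{params[:id]}.img")
  redirect_to paths.first
end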


This architecture can be implemented with any storage-intensive application, or any application that used to rely on NFS, as NFS has proven unreliable in heavy production environments.

I like this tool very much, and I can't wait to test it on our application, so I will keep you posted with any updates.

Saturday, May 12, 2007

Web Antivirus

Web services are increasingly becoming an essential part of your everyday life. How much time do you spend surfing the web?
To be more specific, how much do you feel that Google is involved in your daily routine? Can you imagine your life without Google? Your search, your calendar, your email, your blog, etc.

Well, it seems that Google will also be your web antivirus. Before you access a page, type the URL into Google search and pray that you won't get "this site may harm your computer".
You just have to obey; otherwise your PC will be infected.

The story begins with researchers from the firm surveying billions of sites, subjecting 4.5 million pages to "in-depth analysis". They found about 450,000 pages guilty.

A single visit from you is enough for an attacker to detect and exploit a browser vulnerability. Therefore, the attacker's goal becomes identifying web applications with vulnerabilities that enable him to insert small pieces of HTML into web pages.
An example is an iframe, which can silently install a malware binary in a "drive-by download".
Are the web masters, or the site creators, responsible for this?
The answer: not always.

User Contribution

Many web sites feature web applications that allow visitors to contribute their own content, often in the form of blogs, profiles, comments, or reviews. These usually support only a limited subset of HTML, but in some cases poor sanitization or checking allows users to post arbitrary HTML into web pages.
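
When full HTML sanitization isn't available, the safest default is to escape user markup wholesale. A minimal Ruby sketch (the comment string is illustrative):

require 'cgi'

comment = '<b>nice post</b><script>evil()</script>'  # untrusted user input
safe    = CGI.escapeHTML(comment)
# => "&lt;b&gt;nice post&lt;/b&gt;&lt;script&gt;evil()&lt;/script&gt;" -- injected tags render inert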

Advertising
Although web masters have no direct control over the ads themselves, they trust advertisers to show non-malicious content. Sometimes advertisers rent out part of their advertising space; in this case the web master has to trust ads provided by a company that is merely trusted by the first advertiser. And so on: you may find nested relations, which are a pitfall in the trust relationship because they make it transitive.

Third-Party Widgets
A third-party widget is an embedded link to an external JavaScript file or iframe that a web master uses to provide additional functionality to users. An example of this is Google Analytics :)

Webserver Security
The contents of a web site are only as secure as the set of applications used to deliver the content, including the actual HTTP server, scripting applications (e.g., PHP, ASP, etc.) and database backends. If an attacker gains control of a server, he can modify its content to his benefit. For example, he can simply insert the exploit code into the web server’s templating system. As a result, all web pages on that server may start exhibiting malicious behavior. Although the team observed a variety of web server compromises, the most common infection vector was vulnerable scripting applications. They observed vulnerabilities in phpBB2 and InvisionBoard that enabled an adversary to gain direct access to the underlying operating system. That access can often be escalated to super-user privileges, which in turn can be used to compromise any web server running on the compromised host. This type of exploitation is particularly damaging to large virtual hosting farms, turning them into malware distribution centers.

Exploitation Mechanisms
A popular exploit they encountered takes advantage of a vulnerability in Microsoft’s Data Access Components (MDAC) that allows arbitrary code execution on a user’s computer.
The typical steps taken to turn that vulnerability into remote code execution:
  • The exploit is delivered to a user’s browser via an iframe on a compromised web page.
  • The iframe contains Javascript to instantiate an ActiveX object that is not normally safe for scripting.
  • The Javascript makes an XMLHTTP request to retrieve an executable.
  • Adodb.stream is used to write the executable to disk.
  • A Shell.Application is used to launch the newly written executable.
Another popular exploit is due to a vulnerability in Microsoft’s WebViewFolderIcon. The exploit Javascript uses a technique called "heap spraying", which creates a large number of Javascript string objects on the heap. Each Javascript string contains the x86 machine code (shellcode) necessary to download and execute a binary on the exploited system. By spraying the heap, an adversary attempts to create a copy of the shellcode at a known location in memory and then redirect program execution to it.

Detecting Dangerous Pages
Simply, by monitoring the CPU and the processes executed while the page is loaded. When unknown processes appear in the list, that is a strong sign that a drive-by download has happened.
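
A crude sketch of that idea in Ruby, diffing the process list before and after the page is loaded in a throwaway virtual machine (the browser command and the timing are assumptions):

# snapshot the names of running processes (Unix-style ps, assumed environment)
def process_names
  `ps -eo comm=`.split("\n").map { |name| name.strip }
end

before = process_names
system('sandboxed-browser', 'http://suspect.example/')  # hypothetical sandboxed browser command
sleep 30  # give any drive-by download time to fetch and launch its binary
spawned = process_names - before

puts "possible drive-by download, new processes: #{spawned.inspect}" unless spawned.empty?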

Google will be more and more involved in our lives; it will report malicious sites to you for free.
Anyway, it is not a big deal; you can do some of this yourself, but there are more sophisticated cases where you need several levels of reverse engineering.

Reference: Google Research Paper

Update:
The Google Online Security Blog: the latest news and insights from Google on security and safety on the internet.

Microsoft takes action to respond to the vulnerability claims.