My first wrangle with NoSQL. An explanation of what I made and my thoughts on NoSQL versus the traditional RDBMS

I had a random idea the other day: a server that generates, stores and serves image manipulations on the fly. So that’s what I made. You can check out the GitHub repository here. The server takes HTTP requests much like a standard web server. You give it an image id and any sizing or cropping parameters you like, and it returns the image. When you pass sizing or cropping parameters, it first checks whether an image with those parameters already exists and returns that if it does, which avoids regenerating the same image on every request. Information about the images is stored in a MongoDB database.
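To make the "check first, generate only if missing" idea concrete, here is a minimal sketch in C#. It is not the code from the repository: the VariantCache class, its key format and the in-memory dictionary standing in for the MongoDB collection are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the lookup described above. An in-memory
// dictionary plays the role of the MongoDB collection.
public class VariantCache
{
    private readonly Dictionary<string, byte[]> stored = new Dictionary<string, byte[]>();

    // Counts how many times an image actually had to be generated.
    public int Generated { get; private set; }

    // Build a lookup key from the image id and the requested size.
    public static string Key(string imageId, int width, int height)
    {
        return imageId + ":" + width + "x" + height;
    }

    // Return the stored variant if there is one; otherwise generate and store it.
    public byte[] GetOrCreate(string imageId, int width, int height, Func<byte[]> generate)
    {
        string key = Key(imageId, width, height);
        byte[] image;
        if (!stored.TryGetValue(key, out image))
        {
            image = generate();
            stored[key] = image;
            Generated++;
        }
        return image;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        VariantCache cache = new VariantCache();
        // The second request uses the same parameters, so nothing is regenerated.
        cache.GetOrCreate("42", 100, 50, () => new byte[] { 1, 2, 3 });
        cache.GetOrCreate("42", 100, 50, () => new byte[] { 1, 2, 3 });
        Console.WriteLine(cache.Generated);
    }
}
```

In a real server the dictionary lookup would become a database query on the image id plus the manipulation parameters.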

This was new territory for me, as it was my first venture into the world of NoSQL. I really like the way everything works. I think this project was a good fit for the “Unstructured Data” that NoSQL is really good for. Different images could have a couple of different sets of attributes (and of course this would expand if I added more types of manipulations). It would certainly have been doable in a relational database, but I think this works better.

MongoDB stores its data as BSON, and you write documents and queries in a JSON-like notation. BSON (“Binary JSON”) is a binary-encoded form of the probably more familiar JSON, extended with extra types such as raw binary data and dates. Defining the data this way is rather trivial and, in my opinion, more natural than SQL (and certainly less verbose). I also like the fact that you can have nested data. In some instances this can be used instead of a foreign key in an RDBMS.
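As a rough illustration of nesting instead of a foreign key, the sketch below keeps an image's generated variants inside the image record itself. The ImageRecord and Variant classes are hypothetical, not the project's schema, and System.Text.Json is used only to print the JSON shape; it is not the MongoDB driver and does not produce BSON.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical shapes for illustration; not the project's actual schema.
public class Variant
{
    public int Width { get; set; }
    public int Height { get; set; }
    public string Path { get; set; }
}

public class ImageRecord
{
    public string Id { get; set; }

    // The variants live inside the record, where a relational schema
    // would use a separate table with a foreign key back to the image.
    public List<Variant> Variants { get; set; } = new List<Variant>();
}

public static class NestedDemo
{
    // Serialize the whole record, nested variants included, as one JSON document.
    public static string ToJson(ImageRecord record)
    {
        return JsonSerializer.Serialize(record);
    }

    public static void Main()
    {
        ImageRecord record = new ImageRecord { Id = "42" };
        record.Variants.Add(new Variant { Width = 100, Height = 50, Path = "42_100x50.png" });
        Console.WriteLine(ToJson(record));
    }
}
```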

I did a good bit of reading before I dove into NoSQL to see if it was really any better. I had heard lots of “NoSQL is the new wave and it’s so much better!”. Well, the answer I found: it depends. There are lots of arguments that NoSQL, because of its unstructured nature, is much better at scaling. However, there are plenty that say RDBMSs scale just fine too. I don’t think one scales better than the other; I think it’s just easier to scale up a NoSQL database in terms of code implementation. What I really got out of my reading was that the reason to use NoSQL is not necessarily scaling; it comes down to whether you need the ability to store unstructured data. As I stated, the data in my server was not completely structured, but it could still have been implemented in typical relational style. I’ve heard that there are scenarios where you really do need unstructured data storage. I assume it’s true, though I can’t think of an example myself.

All in all, I like interacting with MongoDB a lot better than using SQL as I would with MySQL (which is what I have been using for data storage since I started programming 8 years ago). However, I could get very similar interactions by using an ORM (Object Relational Mapper). Those pretty much keep the SQL away from my eyes and let me use an object-oriented interface to the database. In general, for my needs, I could use either one. I don’t really have to worry about the scale issue (for now anyway), but like I said, it really comes down to the need for structured vs. unstructured data.

I should note: my GitHub project code is not ‘ready for distribution’. It was really just a test project. The code is there for anyone to play with and modify, or look at as an example. It won’t run straight away if you download it and fire it up. You will need to modify it for your environment.

 

Here is a little tutorial for handling multiple simultaneous connections in C#.

The trick to doing asynchronous I/O with C# sockets is the AsyncCallback. You call the socket’s Begin* methods, passing them an AsyncCallback (a delegate that points at the method to run when the operation completes) and a state object. In this tutorial the state object is the socket itself. When the callback is called, it is passed an IAsyncResult, whose AsyncState property holds the state object you passed. You can cast it back to a Socket and continue processing. Now we can get to the code:

The first thing we need is the proper using directives:

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

You may be wondering why we need System.Threading. This is because we need a ManualResetEvent, which is used to signal between threads: the loop in Run() waits on it, and the accept callback sets it.
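If you haven’t used a ManualResetEvent before, here is a small standalone sketch of the Reset / WaitOne / Set handshake. It has nothing to do with sockets: the SignalDemo class and its message field are made up for illustration, and a thread pool work item stands in for the accept callback.

```csharp
using System;
using System.Threading;

public static class SignalDemo
{
    // Starts unsignaled, like allDone in the server.
    static readonly ManualResetEvent ready = new ManualResetEvent(false);
    static string message;

    public static string RunOnce()
    {
        ready.Reset();                       // make sure WaitOne() will block
        ThreadPool.QueueUserWorkItem(delegate
        {
            message = "callback ran";        // work done on another thread
            ready.Set();                     // release the waiting thread
        });
        ready.WaitOne();                     // block here until Set() is called
        return message;
    }

    public static void Main()
    {
        Console.WriteLine(RunOnce());        // prints "callback ran"
    }
}
```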

We’ll now write a class called ServerRunner, which starts the serving by its method Run(). It has 3 other methods, AcceptCon(), SendData(), and ReceiveData(). All 3 methods take an IAsyncResult “iar”.

First we need a couple of class variables:

        private Byte[] data = new Byte[2048];
        private int size = 2048;
        private Socket server;
        static ManualResetEvent allDone = new ManualResetEvent(false);

This gives us a buffer for the actual transmission of the data, and of course the ManualResetEvent that I explained earlier. Here’s our Run method that starts everything:

        public void Run()
        {
            try
            {
                server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
                IPEndPoint iep = new IPEndPoint(IPAddress.Any, 33333);
                server.Bind(iep);
                Console.WriteLine("Server initialized..");
                server.Listen(100);
                Console.WriteLine("Listening...");
                while (true)
                {
                    allDone.Reset();
                    server.BeginAccept(new AsyncCallback(AcceptCon), server);
                    allDone.WaitOne();
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.ToString());
            }
        }

This starts like any listening socket. Create an IPEndPoint and bind your server socket to it. Then call the Listen() method. Then you want to start an infinite loop calling the BeginAccept() method with the AsyncCallback and state object. Around it, you want your ManualResetEvent’s Reset() and WaitOne() methods. This makes the loop wait until a connection has actually arrived and its callback has started before it begins accepting a new one. In the next method you’ll see the ManualResetEvent’s Set() method, which tells the loop that it is OK to continue to the next connection. Here’s the AcceptCon() method we passed as the AsyncCallback to BeginAccept():

        void AcceptCon(IAsyncResult iar)
        {
            allDone.Set();
            try
            {
                Socket oldserver = (Socket)iar.AsyncState;
                Socket client = oldserver.EndAccept(iar);
                Console.WriteLine(client.RemoteEndPoint.ToString() + " connected");
                byte[] message = Encoding.ASCII.GetBytes("Welcome");
                client.BeginSend(message, 0, message.Length, SocketFlags.None, new AsyncCallback(SendData), client);
            }
            catch (Exception)
            {
                Console.WriteLine("Connection closed..");
                return;
            }
        }

In this method, first we call the ManualResetEvent’s Set() method, which releases the loop in Run() so it can start accepting the next connection. Then we cast iar.AsyncState (the state object we passed to BeginAccept, which was the server Socket) back to what it originally was and call EndAccept() on it to get the client socket. This code sends a simple “Welcome” message to the client that connects, but you can do whatever you want here. We then call the BeginSend method, again with an AsyncCallback (this time to the SendData() method) and a state object (this time the client socket).

        void SendData(IAsyncResult iar)
        {
            try
            {
                Socket client = (Socket)iar.AsyncState;
                int sent = client.EndSend(iar);
                client.BeginReceive(data, 0, size, SocketFlags.None, new AsyncCallback(ReceiveData), client);
            }
            catch (Exception)
            {
                Console.WriteLine("Connection closed..");
                return;
            }
        }

This method finishes off the send, and then starts to listen for more data by calling BeginReceive() with the ReceiveData() method as the AsyncCallback, again passing the client socket as the state object.

        void ReceiveData(IAsyncResult iar)
        {
            try
            {
                Socket client = (Socket)iar.AsyncState;
                int recv = client.EndReceive(iar);
                if (recv == 0)
                {
                    // The client closed the connection. The loop in Run()
                    // already keeps an accept pending, so just clean up.
                    client.Close();
                    return;
                }
                string receivedData = Encoding.ASCII.GetString(data, 0, recv);
                // process received data here
                // decide what to send back
                byte[] message2 = Encoding.ASCII.GetBytes("reply");
                client.BeginSend(message2, 0, message2.Length, SocketFlags.None, new AsyncCallback(SendData), client);
            }
            catch (Exception)
            {
                Console.WriteLine("Connection closed..");
                return;
            }
        }

This is where all the data handling happens: the method takes in data, does what you need with it, and sends back a response. We first check whether the client has closed the connection (EndReceive() returns 0); if so, we close the socket and return. There is no need to call BeginAccept here, because the loop in Run() already keeps an accept pending. This method doesn’t actually do any data handling; it simply sends the string “reply” in response to every piece of data that comes in. But I left comments showing you where to put your own code to deal with the data and come up with a response. When we are done handling the data, we call BeginSend, which sends off the response; its callback, SendData(), then goes back to receiving. This continues until the connection is closed.

A small warning about this code: The only exception handling in here is to keep the server from crashing if the client disconnects unexpectedly. If you are planning to use this as any sort of production code, I suggest you put in much more detailed exception handling.
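One more caveat in the same spirit: the data buffer is a single field shared by every connection, so two clients receiving at the same time can overwrite each other’s bytes. A common way around this is to pass a small per-connection object as the state instead of the bare socket. The sketch below is one possible shape, not part of the original code; the ClientState and Receiver names are made up.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical per-connection state: each client gets its own buffer.
public class ClientState
{
    public const int BufferSize = 2048;
    public readonly Socket Client;
    public readonly byte[] Buffer = new byte[BufferSize];

    public ClientState(Socket client)
    {
        Client = client;
    }
}

public static class Receiver
{
    // Start a receive into this connection's own buffer. The state object
    // handed to the callback is the ClientState rather than the bare socket.
    public static void StartReceive(ClientState state, AsyncCallback callback)
    {
        state.Client.BeginReceive(state.Buffer, 0, ClientState.BufferSize,
                                  SocketFlags.None, callback, state);
    }
}
```

In the callback you would then cast iar.AsyncState to ClientState, call EndReceive on state.Client, and read the bytes from state.Buffer.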

Well. There it is. It’s much simpler than I thought it was going to be, and it only requires those 3 methods really. Hope you can all put this to good use.
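If you want to poke at the server, a tiny console client along these lines should do. It is only a sketch: it assumes the server is running on the same machine on port 33333, the TestClient class is made up for this example, and it treats each Read() as one whole message, which TCP does not guarantee for anything longer than these short strings.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

public static class TestClient
{
    // Read whatever arrives in a single Read() call and decode it as ASCII.
    public static string ReadOnce(Stream stream)
    {
        byte[] buffer = new byte[2048];
        int n = stream.Read(buffer, 0, buffer.Length);
        return Encoding.ASCII.GetString(buffer, 0, n);
    }

    public static void Main()
    {
        // Assumes the server from this tutorial is listening locally.
        using (TcpClient client = new TcpClient("127.0.0.1", 33333))
        using (NetworkStream stream = client.GetStream())
        {
            Console.WriteLine(ReadOnce(stream));      // the greeting from AcceptCon()
            byte[] hello = Encoding.ASCII.GetBytes("hello");
            stream.Write(hello, 0, hello.Length);
            Console.WriteLine(ReadOnce(stream));      // the response from ReceiveData()
        }
    }
}
```

Start a few copies at once to see the server handle several connections at the same time.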

© 2012 Code Brain