Mark Needham

Thoughts on Software Development

Rules of Thumb: Don’t use the session


A while ago I wrote about some rules of thumb for software development that I’d been taught by my colleagues. I was reminded of one of them – don’t put anything in the session – during a presentation my colleague Luca Grulla gave at our client on scaling applications by making use of the infrastructure of the web.

The problem with putting state in the session is that it means that requests from a specific user have to be tied to a specific server i.e. we have to use a sticky session/session affinity.

This reduces our ability to scale our system horizontally (scale out) i.e. by adding more servers to handle requests.

If, for example, we have a small number of users (whose first request went to the same server) making a lot of requests (perhaps through AJAX calls) then we may quickly put one of our servers under load while the others are sitting there idle.
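To make the imbalance concrete, here is a minimal sketch that counts requests per server with and without session affinity. The server names, the user names and the simple round-robin distribution are all made up for illustration; a real load balancer will have its own strategy.

```python
from collections import Counter
from itertools import cycle

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def sticky_load(requests, affinity):
    """Count requests per server when each user is pinned to one server."""
    load = Counter({server: 0 for server in SERVERS})
    for user in requests:
        load[affinity[user]] += 1
    return load


def round_robin_load(requests):
    """Count requests per server when any server can handle any request."""
    load = Counter({server: 0 for server in SERVERS})
    for _user, server in zip(requests, cycle(SERVERS)):
        load[server] += 1
    return load


# Three busy users whose first request happened to land on server-a.
affinity = {"alice": "server-a", "bob": "server-a", "carol": "server-a"}
requests = ["alice", "bob", "carol"] * 100  # 300 requests in total

sticky = sticky_load(requests, affinity)
spread = round_robin_load(requests)
print(dict(sticky))
print(dict(spread))
```

With affinity all 300 requests end up on one server, whereas without it each of the three servers handles 100.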

In addition we have increased complexity around our deployment process.

If we want to do an incremental deployment of a new version of our website across some of our servers then we need to ensure that any sessions on the servers being updated are copied to the ones we’re not updating, so that any users still on the system don’t experience loss of data.

There are no doubt products which allow us to do this more easily, but such a product seems unnecessary to me in the first place since we can just design our application to not rely on the session.

As I understand it, the web was designed to be stateless, i.e. each request is independent and all the information is contained within that request. The idea of the session was only added later on.

How does the way we code change if we don’t use the session?

One thing we’ve often used the session for on projects that I’ve worked on is to store the current state of a form that the user is filling in.

When they’ve completed the form then we would probably store some representation of what they’ve entered in a database.

If we don’t use the session then we need to store this intermediate data somewhere and include a key to load it in the request.

On the project I’m working on at the moment we’re storing that data in a database but then clearing out that data every other day since it’s not needed once the user has completed the form.

An alternative perhaps could be to store it in a cache since in reality all we have is a key/value pair which we need to keep for a relatively short amount of time.
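As a rough sketch of what this could look like, the code below keeps the partially completed form in a table keyed by a random token and purges anything older than a cut-off. It is not the code from our project: the table name, the function names, the two day retention and the use of an in-memory SQLite database are all assumptions made so that the example is self-contained.

```python
import json
import secrets
import sqlite3
import time

# Assumed retention period, loosely matching "clearing out every other day".
MAX_AGE_SECONDS = 2 * 24 * 60 * 60


def open_store():
    """Create an in-memory table standing in for the shared database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE form_state (key TEXT PRIMARY KEY, data TEXT, saved_at REAL)"
    )
    return conn


def save_state(conn, data, key=None, now=None):
    """Store the partial form under a key and return the key for the next request."""
    key = key or secrets.token_urlsafe(16)  # hard-to-guess key
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO form_state (key, data, saved_at) VALUES (?, ?, ?)",
        (key, json.dumps(data), now),
    )
    return key


def load_state(conn, key):
    """Return the saved form data for a key, or None if there is none."""
    row = conn.execute(
        "SELECT data FROM form_state WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def purge_old(conn, now=None):
    """Delete entries older than MAX_AGE_SECONDS and return how many were removed."""
    now = time.time() if now is None else now
    cursor = conn.execute(
        "DELETE FROM form_state WHERE saved_at < ?", (now - MAX_AGE_SECONDS,)
    )
    return cursor.rowcount


conn = open_store()
key = save_state(conn, {"step": 1, "name": "Ada"}, now=0)
save_state(conn, {"step": 2, "name": "Ada", "email": "ada@example.com"}, key=key, now=10)
state = load_state(conn, key)
removed = purge_old(conn, now=10 + MAX_AGE_SECONDS + 1)
after = load_state(conn, key)
```

The key is the only thing that has to travel with each request. Any server that can reach the store can load the data, and a cache with a built-in expiry would make the purge step unnecessary.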

Advantages/disadvantages of this approach

The disadvantage of this approach is that we have to make more reads and writes to the database to deal with this temporary data.

Apart from the advantages I outlined initially, we are also more protected if a server handling a user’s request goes down.

If we were using the session to store intermediate state then that information would be lost and they would have to start over.

In the approach we’re using this isn’t a problem and when the request is sent to another server we can still query the database and get whatever data the user had already saved.

As with most things there’s a trade-off to be made but in this case it seems a fair one to me.

Alternative approaches

I’ve come across some alternative approaches where we avoid using the session but don’t store intermediate state in a database.

One way is to store that state in hidden fields on the form and another is to send it in the request parameters.

Neither of these approaches seems particularly clean to me and they give the user an easier way to change the intermediate data in ways that the form might not allow them to do.

From my experience our server side code becomes more complicated since we’re always writing all of the data entered so far back into the page.

In addition the URL becomes a complete mess with the second approach.
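A small sketch of the request parameter approach shows both problems. The path, the field names and the list of allowed plans are invented for the example; the point is only that every field entered so far ends up in the URL, and that the server has to re-check whatever comes back because the user can edit it.

```python
from urllib.parse import parse_qs, urlencode

ALLOWED_PLANS = {"basic", "premium"}  # hypothetical options offered by the form


def next_step_url(base, state):
    """Build the URL for the next step, carrying all state entered so far."""
    return f"{base}?{urlencode(state)}"


def read_state(query):
    """Parse state from a query string and re-validate it on the server."""
    state = {name: values[0] for name, values in parse_qs(query).items()}
    if state.get("plan") not in ALLOWED_PLANS:
        raise ValueError("plan is not one of the options the form offers")
    return state


url = next_step_url(
    "/signup/step3",
    {"name": "Ada Lovelace", "email": "ada@example.com", "plan": "basic"},
)
print(url)

# The user can edit the URL to a value the form would never have submitted.
tampered = url.replace("plan=basic", "plan=free-forever")
try:
    read_state(tampered.split("?", 1)[1])
    rejected = False
except ValueError:
    rejected = True
```

Even with three fields the URL is already long, and it grows with every step of the form.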

Written by Mark Needham

February 16th, 2010 at 11:19 pm

Posted in Coding


  • DJ

    Martin Fowler documented these in his book Patterns of Enterprise Application Architecture.

  • http://sarahtaraporewalla.blogspot.com/ Sarah Taraporewalla

    There is a product called ScaleOut (http://www.scaleoutsoftware.com/) that gives you this functionality – it provides the session handling so your app servers can scale horizontally.

    I’m not saying that I advocate sessions, but what I am saying is that just because you have a session does not mean you cannot scale horizontally.

  • http://www.markhneedham.com Mark Needham

    @DJ – ah really, I’ve read most of that book, guess I missed that part. Will have a look.

    @Sarah – I think someone did point that out during the brown bag session actually. I imagine that type of software probably costs much more than it would cost to scale out your application if it didn’t use the session though?

  • http://www.isanchez.net Ivan Sanchez

    On my current project we store sessions in a data grid. We don’t see a reason to persist them in a database. The objects are small (or at least should be), don’t last for very long and need easy access. In this scenario I believe any form of distributed memory is the best fit to handle a large number of sessions.

  • http://www.leonardoborges.com Leonardo Borges

    There is one additional approach which is to use cookies to store session information.

    This is the default approach in Rails for instance and, as long as you don’t hang huge objects in the session – which is a bad idea anyway – works quite well.

    The idea of storing those in the database isn’t bad either… heavy traffic websites like ebay use that exactly because they need to scale.

    And you could always put a memcache server in between to reduce the latency.

  • http://galilyou.blogspot.com/2010/02/why-this-is-not-possible-in-c-generics.html Galilyou

    Mark, thanks for the post. In my last project I used to store everything I wanted in cookies, which worked like a charm for me and scales perfectly across many servers (as long as the encrypted cookies are readable by all the servers). However, in my current project we are using sessions very heavily to store really deep domain objects. It’s going to be an intranet site for a few hundred employees, and I think with this small number scalability is not much of a problem. I think in such a case using sessions (even with relatively large objects) is tolerable. Just my 2 cents.

  • http://smsohan.blogspot.com S. M. Sohan

    You can use a session that is stored inside a database. .Net and Ruby on Rails have built-in support for this. Having the session inside a database means that when you scale out, your applications can simply use that database and copying memory between servers is no longer required!

  • http://www.davidron.com David Ron

    @Leonardo: To circumvent the “large object” issue, you can use HTML5 Localstorage and HTML5’s embedded SQL server to store megabytes and megabytes of information. If you couple this with some simple web services for server-side validation (credit card authorization, etc) it should work quite nicely.

    I guess we have to wait for that to become slightly more available (Internet Explorer).

  • http://www.stripesbook.com Frederic Daoud

    Hi Mark,

    Nice article. I am interested in trying out a solution where the session is not used at all, and state is saved to the database as you describe.

    My question is, how do you associate the user to the data in the database? In a session-less solution, you can’t store the user ID in the session and use that to find the associated data in the database.

    What do you suggest? A hidden field in the form? What would be the value? A randomly generated, hashed value to prevent users from trying to hack the value and get someone else’s data?

    I’d love your input on this. Thanks!

  • http://www.stripesbook.com Frederic Daoud

    I guess you could just store a generated id in a cookie and use that to retrieve the information from the database. If the id is random and hashed, it should be relatively hack-proof.

    What do you think?

  • http://jclaes.blogspot.com Jef Claes

    You can already store SessionState in your database with ASP.NET and SQL Server out-of-the-box.

    This way you can rely on Session like you would normally do and when it’s time to scale out, put everything in SQL Server?

    More info here: http://support.microsoft.com/kb/317604

  • http://rafanoronha.net/ Rafael Noronha

    Frederic,

    Session id is defined within http conversation, and probably you can count on it anyway.

    Maybe you need a little hack at your web framework for defining the Session id for the first time, because it is probably defined when you deal with the session through the framework.

    Mark,

    Wouldn’t a good infrastructure framework do the job transparently?

  • Dave

    Session replication is common and trivial.

  • Steve

    This advice seems a little dated. It should be no problem storing sessions on the server with a shared cache. For some data, if it is not needed on the server, you can use a technology like Gears.

    I have seen sites store information in hidden fields, but they end up transmitting large http requests for simple transactions because absolutely everything is in the hidden field and it is all required to process the request.

    Why are you keeping session info in the database after the session is expired if you do not plan on keeping it long term?

  • http://arne.syscdn.de Arne Riemann

    We are also storing our sessions inside a database table. The Zend Framework has a Session adapter which supports this approach.

  • Frans

    First off, sessions CAN be made to work across server farms so the scalability issue is not too accurate:
    In IIS for instance, there are 3 methods for session management: InProc (fastest, not scalable as you said, data stored on the local server), SessionStateServer (still fast, data stored on a SEPARATE server so as to mitigate the need to copy all state data to all the servers), and SQL (the session data is stored in a SQL database automatically; you just provide the connection string in web.config).
    That being said, it IS possible to get sessions to work across farms.

    Second, regarding cookies: if you’re trying to store ultra large objects in a session, there is a trade-off: which is the limiting factor for you, server memory or bandwidth?
    If it’s server memory then by all means you should avoid sessions and use cookies (FormsAuthenticationTickets) or send the data in HTML forms. But bear in mind that this means the data is going back and forth between the client and the server (unnecessarily) and you’re wasting bandwidth, not to mention exposing the data to the client.