Archive for July, 2009

Cloud Computing Services at Space-Time Research

Tuesday, July 28th, 2009 by Jo Deeker

I have been doing a lot of reading about cloud computing and concerns over security of data. In case you hadn’t noticed, cloud computing is a hot topic and IT magazines and blogs are overflowing with articles. Kundra is talking about it (Kundra courts the risk of innovation - Government Computer News), Gov 2.0 and Data.gov encourage it, and some US city departments are investigating moving all their services into a cloud (L.A. weighs plan to replace computer software with Google service - Los Angeles Times).

At first I wondered what all the fuss was about - it’s only third party hosting of applications after all, and it’s already been done – A LOT. Over the last few weeks I’ve delved a bit deeper, and discovered that my understanding of the technology, and options available, was limited. There are a number of different ways applications can be hosted or delivered via a cloud, and putting your application on a separate server housed at an external provider, which is what we do for some of our existing clients, is a very simple but expensive way to do it. I’ve since discovered there are other ways that might be better.

I have worked at and with large organizations over the last 20 years, and I understand why the idea of moving applications into a cloud is attractive. Sometimes it can be nearly impossible for a business unit within an organization to get a server or space on a server to host applications. And if you can get one, for some organizations, it can cost up to hundreds of thousands of dollars even if the server itself only cost a few thousand. Here we have an opportunity to get rid of one of the major stumbling blocks in putting a new application (particularly a web-application) out there.

The potential benefits of cloud computing are clear:

  • It can be MUCH cheaper. We’ve worked out that a basic SuperVIEW application could be hosted for under sixty dollars a month (depending on number of users etc.). This compares with an external hosting service cost of $1500 AUD per month for a dedicated server.
  • It removes constraints imposed by IT departments, or even harder to deal with, IT Service Providers. The approvals to host applications on internal servers can be onerous.
  • It can scale up or down with demand, particularly when there is an initial peak load. I’m hoping that when we launch 2011 census data online with the Australian Bureau of Statistics we can use cloud resources to cope with our initial peak loads.
  • As the hardware and infrastructure are already available, it can be very quick to deploy an application and use it. No more waiting for the server to be ready.

The major considerations are:

  • Some cloud services offerings won’t tell you or guarantee where your data is stored and this makes some organizations nervous.
  • The technology and different options available are new and don’t necessarily follow strict government security procedures. I figure that by the time some government organizations are ready to launch an application it will be sorted out.
  • Working out your optimal pricing can be a little tricky - it’s a bit like a mobile phone plan and if you don’t know how your system is going to be used, it can be hard to work out which is the most cost-effective model.
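To show why the pricing question depends so much on usage, here is a small Python sketch that compares a flat dedicated-server fee with a usage-based plan. Only the $1500 per month figure comes from the list above; the base fee, the per-request rate and the function names are made-up assumptions for illustration, and both plans are assumed to be priced in the same currency.

```python
# Flat monthly fee for a dedicated server, as quoted in the list above.
DEDICATED_MONTHLY = 1500.0


def pay_per_use_cost(requests_per_month, base_fee=20.0, rate_per_1000=0.50):
    """Monthly cost of an illustrative usage-based plan.

    base_fee and rate_per_1000 are hypothetical numbers, not rates
    published by any real cloud provider.
    """
    return base_fee + (requests_per_month / 1000.0) * rate_per_1000


def cheaper_plan(requests_per_month):
    """Name the cheaper of the two plans for a given monthly load."""
    usage_cost = pay_per_use_cost(requests_per_month)
    return "pay-per-use" if usage_cost < DEDICATED_MONTHLY else "dedicated"


def break_even_requests(base_fee=20.0, rate_per_1000=0.50):
    """Monthly request count at which both plans cost the same."""
    return (DEDICATED_MONTHLY - base_fee) / rate_per_1000 * 1000.0
```

With these assumed rates the usage-based plan wins until traffic gets very heavy, which is the mobile-phone-plan effect: the right answer depends on a load figure you may not know in advance.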

We have recently come up with a couple of cloud offerings for our SuperVIEW software that offer the best of both worlds for SuperSTAR customers. Our customers have given us some direct feedback that they are very interested in cloud models for hosting web applications, but they would like to keep their data in-house. This is not simply an issue of security; all of our customers have substantial data management systems in place, either fully in-house, or connected to privately outsourced data centres. Having the data for SuperVIEW hosted in-house ensures that the provider retains full ownership and does not have to extend its data management policies to address the differences that cloud computing would introduce.

Our HYBRID model fits this bill. The SuperVIEW application is hosted on a cloud provided by the Google App Engine. Via a secure data connector developed by STR, the application connects to a customer’s existing SuperSTAR database housed internally. Encrypted, aggregated data is returned to the web application for analysis and visualization in the SuperVIEW web client. Because SuperSTAR databases are read-only, and cannot be manipulated by SQL or other programs, the raw data is secure and is not vulnerable to alteration or attack.
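To make the data flow a little more concrete, here is a minimal Python sketch of the general idea: the in-house side collapses unit records into counts and only that aggregated table is sent to the cloud-hosted application. This is not STR's actual data connector; the function names and sample records are hypothetical, and the encrypted transport is left out.

```python
import json
from collections import Counter

# Hypothetical unit records held in-house; they never leave this side.
RAW_RECORDS = [
    {"region": "North", "age_group": "0-14"},
    {"region": "North", "age_group": "15-64"},
    {"region": "South", "age_group": "0-14"},
]


def aggregate(records, field):
    """Collapse unit records into a count per category of one field."""
    return dict(Counter(record[field] for record in records))


def connector_response(records, field):
    """Build the payload sent to the cloud-hosted web application.

    Only the aggregated table is serialised. A real deployment would
    also send it over an encrypted channel, which this sketch omits.
    """
    payload = {"field": field, "counts": aggregate(records, field)}
    return json.dumps(payload, sort_keys=True)
```

Calling `connector_response(RAW_RECORDS, "region")` produces a JSON string holding counts per region and nothing about individual records, which is the property the hybrid model relies on.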

We also want to offer the ability to experience the whole SuperSTAR application in a cloud using a different service provider. Currently, we do provide fully hosted dedicated-server solutions, and over the next month we are working out who best to source these services from in a more distributed environment. There are some customers who will always want to keep their data management tools in-house, but others may want to migrate the whole solution to a cloud. We expect to be able to provide a hybrid or fully cloud-based SuperSTAR service to customers with the next release of our software in the next month or so.

Until next time,

Jo

The Auto Correlation Engine

Sunday, July 12th, 2009 by Andrew Naish

An idea came to me after viewing the Campaigners embrace maps article from The Economist.

Say you had a bunch of data, and I’m not talking a couple of spreadsheets, I’m talking tens of millions of records, each holding attribute information… so much data you literally don’t know what to do with it, like perhaps all the information collected by governments around the world in their yearly census. It’s too big to simply browse through to find anything useful, and there are too many geographic layers to add into a G.I.S to do any manual spatial analytics on it. But you know there’s gold in the data somewhere. You know there must be some correlation between separate observations.

Enter the Auto Correlation Engine.

Imagine you had a system whereby, for each geographic layer (State, Suburb, Region, Census District, etc.), you could attach a predefined observation (e.g. count, percentage, calculation), derive all the possible spatial correlation indices amongst the observations, and have them reported to you.

For example:

Let’s say you’re a government employee in charge of deciding what to do next about the high rates of child obesity in your district. Naturally, as a G.I.S user, you decide to add a new layer to your system displaying the count (and the lat/long points, perhaps using proportional symbols) of obese children in your district. But what next? Do you add the fast food restaurants and perhaps do some concentric ring analysis? Do you compare it with a layer displaying the number of game consoles bought in the area?

What if you had a system that had already found out, for that layer, which other geographic layers and associated attributes have a high correlation index? As soon as you added the child obesity rates to your G.I.S platform, the Auto Correlation Engine would already have determined that there is a correlation between high child obesity rates and the number of parks in the area, and would inform you of it. It would then ask whether you would like to add the correlated layer to your map. Of course it wouldn’t be one of those annoying Microsoft paperclips, but it might be useful.
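As a rough illustration of the idea, here is a minimal Python sketch that takes one observation per district for each layer and reports which layers correlate strongly with a chosen target layer. It uses a plain Pearson correlation as a stand-in for a true spatial correlation index, and the layer names, values and threshold are all made up for the example.

```python
from math import sqrt

# Hypothetical observations, one value per district, in the same order.
HYPOTHETICAL_LAYERS = {
    "child_obesity_rate": [12.0, 18.0, 25.0, 9.0, 21.0],
    "parks": [8, 5, 2, 10, 3],
    "rainfall_mm": [600, 610, 590, 605, 615],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant layer carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlated_layers(layers, target, threshold=0.8):
    """Return (layer_name, r) pairs whose |r| with target meets threshold."""
    results = []
    for name, values in layers.items():
        if name == target:
            continue
        r = pearson(layers[target], values)
        if abs(r) >= threshold:
            results.append((name, r))
    # Strongest relationships first, whether positive or negative.
    return sorted(results, key=lambda item: -abs(item[1]))
```

Running `correlated_layers(HYPOTHETICAL_LAYERS, "child_obesity_rate")` on this toy data flags the parks layer (a strong negative relationship) and ignores rainfall. A real engine would precompute these values for every pair of layers and would need a measure that accounts for geography, which this sketch does not attempt.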