Hary,

Thanks for the clarification and for pointing out the drawbacks of using HDF5. I am all for open source software, so if MySQL etc. works then great.

I found TransCAD really fast at processing large datasets (though not 10 GB), especially when sorting, and much faster than standard statistical packages.

Krishnan




On Fri, Nov 15, 2013 at 9:03 AM, hprawiranata@mitcrpc.org <hprawiranata@mitcrpc.org> wrote:
Krishnan,

HDF is a data format; most climate data is stored in HDF format (I have
experience writing C code for this format on SPARC computers a long
time ago, and I converted from HDF to MySQL!). CTPP is just a flat
text data set. Converting it into HDF is a different, complicated story
and a step in the wrong direction.

Linking data across many tables has to be done with a database
engine. MS Access won't cut it, and MS SQL Server is expensive. A free,
open-source engine is the only way to go, and it is fast.
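
As a rough sketch of what that looks like (SQLite here is just a
stand-in for any free engine such as MySQL, and the file, table, and
column names are made up for illustration, not taken from the CTPP
documentation):

import csv
import sqlite3

# Load one CTPP flat file (tab-delimited txt) into a table, then do the
# join inside the engine so the full dataset never has to fit in memory.
con = sqlite3.connect("ctpp.db")
cur = con.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS part1 (geoid TEXT, table_id TEXT, estimate REAL)")
with open("MD_part1.txt", newline="") as f:    # hypothetical file name
    rows = csv.reader(f, delimiter="\t")
    next(rows)                                 # skip the header row
    cur.executemany("INSERT INTO part1 VALUES (?, ?, ?)", rows)
con.commit()

# Assuming a geo_lookup table was loaded the same way, the join is one query.
for row in cur.execute(
    "SELECT p.geoid, g.name, p.estimate "
    "FROM part1 p JOIN geo_lookup g ON g.geoid = p.geoid LIMIT 10"
):
    print(row)
con.close()

The same SQL runs essentially unchanged against MySQL; only the
connection setup differs.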

TransCAD... I have never tested its limits, but linking simple tables is
OK. Large and many tables? Better to use data modeling software and link
to a database engine.

Hary(ono) Prawiranata
Transportation Analyst/Modeler
Tri-County Regional Planning Commission
3135 Pine Tree Rd. Ste 2C
Lansing, MI 48911

On Thu, Nov 14, 2013 at 9:58 PM, Krishnan Viswanathan
<krisviswanathan@gmail.com> wrote:
> Mara
>
> Besides SQL server I have the following suggestions:
> 1) the ff package in R (
> http://www.bnosac.be/index.php/blog/22-if-you-are-into-large-data-and-work-a-lot-package-ff)
> 2) HDF5 seems like a decent option, though I have not used it; a rough
> sketch of getting data in and out follows this list. Link to rhdf5
> ( http://bioconductor.org/packages/release/bioc/html/rhdf5.html). Also,
> SFCTA has some code for getting data into and out of HDF5 (
> https://github.com/sfcta/TAutils/tree/master/hdf5)
> 3) I have found TransCAD to be efficient in processing large datasets.
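>
> For option 2, a rough sketch of the HDF5 workflow in Python with pandas
> (this assumes the pytables package is installed; the file and column
> names are made up for illustration, not taken from the CTPP docs):
>
> import pandas as pd
>
> # Stream the flat file in chunks so the full 10 GB never sits in memory,
> # appending each chunk to a compressed HDF5 store as we go.
> store = pd.HDFStore("ctpp.h5", complevel=9, complib="zlib")
> for chunk in pd.read_csv("MD_part1.txt", sep="\t", chunksize=500000):
>     store.append("part1", chunk, data_columns=["GEOID"])
> store.close()
>
> # Later, read back only the rows you need instead of the whole table.
> subset = pd.read_hdf("ctpp.h5", "part1", where="GEOID == '24005'")
>
> The main appeal over a flat txt file is that the store stays queryable
> without ever loading the whole dataset.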
>
> Hope this helps.
>
> Krishnan
>
> I downloaded the Maryland state raw data (the whole enchilada) that Penelope
> was good enough to provide me. It came with documentation that clearly
> explains what needs to be done, but I am being hampered by the sheer size of
> the dataset. It's 10 GB, and that's without going into joining tables,
> transposing them to meet my needs, etc. Even breaking the parts into
> different databases, it can't be handled in Access. I can fit Part 1 into an
> ESRI geodatabase, but I don't have the flexibility in linking tables that
> Access has.
>
>
>
> Does anyone have any suggestions for dealing with large databases? SQL
> Server is one option. Are there others?
>
>
>
> Mara Kaminowitz, GISP
> GIS Coordinator
> .........................................................................
> Baltimore Metropolitan Council
> Offices @ McHenry Row
> 1500 Whetstone Way
> Suite 300
> Baltimore, MD 21230
> 410-732-0500 ext. 1030
> mkaminowitz@baltometro.org
> www.baltometro.org
>
>
>
>
_______________________________________________
ctpp-news mailing list
ctpp-news@ryoko.chrispy.net
http://ryoko.chrispy.net/mailman/listinfo/ctpp-news



--
Krishnan Viswanathan
5628 Burnside Circle
Tallahassee FL 32312