postgresql - Is Hadoop Suitable For This?

May 15, 2014

We have Postgres queries that take 6 to 12 hours to complete, and we are wondering whether Hadoop is suited to doing this faster. We have two 64-core servers with 256 GB of RAM that Hadoop could use.

We're running PostgreSQL 9.2.4. Postgres uses only 1 core on 1 server for this query, so I'm wondering whether Hadoop could be about 128 times faster, minus overhead.

We have 2 sets of data, each with millions of rows.

Set one:

    id character varying(20),
    a_lat double precision,
    a_long double precision,
    b_lat double precision,
    b_long double precision,
    line_id character varying(20),
    type character varying(4),
    freq numeric(10,5)

Set two:

    a_lat double precision,
    a_long double precision,
    b_lat double precision,
    b_long double precision,
    type character varying(4),
    freq numeric(10,5)

We have indexes on the lat, long, type, and freq fields, using btree. Both tables have "vacuum analyze" run right before the query.

The Postgres query is:

    select
        id
    from
        setone one
    where
        not exists (
            select
                'x'
            from
                settwo two
            where
                two.a_lat >= one.a_lat - 0.000278 and
                two.a_lat <= one.a_lat + 0.000278 and
                two.a_long >= one.a_long - 0.000278 and
                two.a_long <= one.a_long + 0.000278 and
                two.b_lat >= one.b_lat - 0.000278 and
                two.b_lat <= one.b_lat + 0.000278 and
                two.b_long >= one.b_long - 0.000278 and
                two.b_long <= one.b_long + 0.000278 and
                (
                    two.type = one.type or
                    two.type = 's'
                ) and
                two.freq >= one.freq - 1.0 and
                two.freq <= one.freq + 1.0
        )
    order by
        line_id

Is this the type of thing Hadoop can do? If so, can you point me in the right direction?

Try Stado at http://stado.us. Use this branch: https://code.launchpad.net/~sgdg/stado/stado, which will be used for the next release.

Even with 64 cores, you are using only 1 core to process that query. Stado can create multiple PostgreSQL-based "nodes" on a single box and leverage parallelism to get those cores working.
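
To see why this kind of query can be spread over several nodes at all, here is a rough sketch. It is not Stado's API and says nothing about how Stado actually distributes the data. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, a cut-down schema (only a_lat and freq), made-up rows, and a hypothetical helper `unmatched_ids` that restricts the outer table to one slice with `rowid % slice_count`. Each row of setone is checked against settwo independently, so the slices could in principle run on separate cores and the results be concatenated. Here they simply run one after another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table setone (id text, a_lat real, freq real);
    create table settwo (a_lat real, freq real);
""")
# Made-up rows: every third setone row has a close neighbour in settwo.
conn.executemany(
    "insert into setone values (?, ?, ?)",
    [(f"r{i}", 40.0 + i, 100.0 + i) for i in range(12)],
)
conn.executemany(
    "insert into settwo values (?, ?)",
    [(40.0 + i + 0.0001, 100.0 + i) for i in range(0, 12, 3)],
)

# Same anti-join shape as the question, limited to one slice of setone.
SLICE_QUERY = """
    select id from setone one
    where one.rowid % :slice_count = :slice_index
      and not exists (
        select 'x' from settwo two
        where two.a_lat >= one.a_lat - 0.000278
          and two.a_lat <= one.a_lat + 0.000278
          and two.freq >= one.freq - 1.0
          and two.freq <= one.freq + 1.0
      )
"""

def unmatched_ids(slice_index, slice_count):
    """Hypothetical helper: ids in one slice of setone with no match in settwo."""
    params = {"slice_index": slice_index, "slice_count": slice_count}
    return [row[0] for row in conn.execute(SLICE_QUERY, params)]

# One slice covering everything versus four disjoint slices.
whole = sorted(unmatched_ids(0, 1))
pieces = sorted(i for k in range(4) for i in unmatched_ids(k, 4))
print(whole == pieces)
```

The final `order by line_id` would still have to be applied once, after the per-slice results are merged.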

In addition, I have had success converting correlated not exists queries to (select count(*) ...) = 0.
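
A small sketch of that rewrite on toy data, in case it helps. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, with a reduced version of the two tables (the b_lat and b_long columns are left out) and sample rows that are made up. It only shows that both forms return the same ids. Whether the count(*) form is actually faster on 9.2.4 is something to check with EXPLAIN ANALYZE on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table setone (id text, a_lat real, a_long real,
                         type text, freq real, line_id text);
    create table settwo (a_lat real, a_long real, type text, freq real);
""")
# Made-up sample rows, for illustration only.
conn.executemany("insert into setone values (?, ?, ?, ?, ?, ?)", [
    ("r1", 40.0, -75.0, "a", 100.0, "L1"),  # has a nearby settwo row
    ("r2", 41.0, -76.0, "b", 200.0, "L2"),  # no nearby settwo row
    ("r3", 42.0, -77.0, "c", 300.0, "L3"),  # matched through type 's'
])
conn.executemany("insert into settwo values (?, ?, ?, ?)", [
    (40.0001, -75.0001, "a", 100.5),
    (42.0, -77.0, "s", 299.5),
])

# Correlated match condition shared by both query forms.
MATCH = """
    two.a_lat >= one.a_lat - 0.000278 and
    two.a_lat <= one.a_lat + 0.000278 and
    two.a_long >= one.a_long - 0.000278 and
    two.a_long <= one.a_long + 0.000278 and
    (two.type = one.type or two.type = 's') and
    two.freq >= one.freq - 1.0 and
    two.freq <= one.freq + 1.0
"""

not_exists_sql = f"""
    select id from setone one
    where not exists (select 'x' from settwo two where {MATCH})
    order by line_id
"""
count_sql = f"""
    select id from setone one
    where (select count(*) from settwo two where {MATCH}) = 0
    order by line_id
"""

ids_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
ids_count = [row[0] for row in conn.execute(count_sql)]
print(ids_not_exists, ids_count)  # both lists contain only 'r2' for this toy data
```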
