Quantcast

count * performance issue

classic Classic list List threaded Threaded
107 messages Options
1234 ... 8
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

count * performance issue

sathiya psql
count(*) tooks much time...

but with the where clause we can make this to use indexing,... what where clause we can use??

Am using postgres 7.4 in Debian OS with 1 GB RAM,

am having a table with nearly 50 lakh records,

it has more than 15 columns, i want to count how many records are there, it is taking nearly 17 seconds to do that...

i know that to get a approximate count we can use
         SELECT reltuples FROM pg_class where relname = TABLENAME;

but this give approximate count, and i require exact count...

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

chris smith-9
sathiya psql wrote:
> count(*) tooks much time...
>
> but with the where clause we can make this to use indexing,... what
> where clause we can use??
>
> Am using postgres 7.4 in Debian OS with 1 GB RAM,
>
> am having a table with nearly 50 lakh records,

Looks suspiciously like a question asked yesterday:

http://archives.postgresql.org/pgsql-performance/2008-03/msg00068.php

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

A. Kretschmer
In reply to this post by sathiya psql
am  Thu, dem 06.03.2008, um 11:13:01 +0530 mailte sathiya psql folgendes:
> count(*) tooks much time...
>
> but with the where clause we can make this to use indexing,... what where
> clause we can use??

An index without a WHERE can't help to avoid a seq. scan.


>
> Am using postgres 7.4 in Debian OS with 1 GB RAM,

PG 7.4 are very old... Recent versions are MUCH faster.



>
> am having a table with nearly 50 lakh records,
>
> it has more than 15 columns, i want to count how many records are there, it is
> taking nearly 17 seconds to do that...
>
> i know that to get a approximate count we can use
>          SELECT reltuples FROM pg_class where relname = TABLENAME;
>
> but this give approximate count, and i require exact count...

There aren't a general solution. If you realy need the exact count of
tuples than you can play with a TRIGGER and increase/decrease the
tuple-count for this table in an extra table.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

Shoaib Mir
On Thu, Mar 6, 2008 at 5:08 PM, A. Kretschmer <[hidden email]> wrote:>
> am having a table with nearly 50 lakh records,
>
> it has more than 15 columns, i want to count how many records are there, it is
> taking nearly 17 seconds to do that...
>
> i know that to get a approximate count we can use
>          SELECT reltuples FROM pg_class where relname = TABLENAME;
>
> but this give approximate count, and i require exact count...

There aren't a general solution. If you realy need the exact count of
tuples than you can play with a TRIGGER and increase/decrease the
tuple-count for this table in an extra table.



Or do something like:

ANALYZE tablename;
select reltuple from pg_class where relname = 'tablename';

That will also return the total number of rows in a table and I guess might be much faster then doing a count(*) but yes if trigger can be an option that can be the easiest way to do it and fastest too.

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

sathiya psql
buy every time i need to put ANALYZE...
this takes the same time as count(*) takes, what is the use ??

On Thu, Mar 6, 2008 at 11:45 AM, Shoaib Mir <[hidden email]> wrote:
On Thu, Mar 6, 2008 at 5:08 PM, A. Kretschmer <[hidden email]> wrote:>
> am having a table with nearly 50 lakh records,
>
> it has more than 15 columns, i want to count how many records are there, it is
> taking nearly 17 seconds to do that...
>
> i know that to get a approximate count we can use
>          SELECT reltuples FROM pg_class where relname = TABLENAME;
>
> but this give approximate count, and i require exact count...

There aren't a general solution. If you realy need the exact count of
tuples than you can play with a TRIGGER and increase/decrease the
tuple-count for this table in an extra table.



Or do something like:

ANALYZE tablename;
select reltuple from pg_class where relname = 'tablename';

That will also return the total number of rows in a table and I guess might be much faster then doing a count(*) but yes if trigger can be an option that can be the easiest way to do it and fastest too.

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

Shoaib Mir
On Thu, Mar 6, 2008 at 5:19 PM, sathiya psql <[hidden email]> wrote:
buy every time i need to put ANALYZE...
this takes the same time as count(*) takes, what is the use ??



Dont you have autovacuuming running in the background which is taking care of the analyze as well?

If not then hmm turn it on and doing manual analyze then shouldnt I guess take much time!

But yes, I will say if its possible go with the trigger option as that might be more helpful and a very fast way to do that.

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

Mark Mielke
In reply to this post by Shoaib Mir
 
There aren't a general solution. If you realy need the exact count of
tuples than you can play with a TRIGGER and increase/decrease the
tuple-count for this table in an extra table.

Of course, this means accepting the cost of obtaining update locks on the count table.

The original poster should understand that they can either get a fast estimated count, or they can get a slow accurate count (either slow in terms of select using count(*) or slow in terms of updates using triggers and locking).

Other systems have their own issues. An index scan may be faster than a table scan for databases that can accurately determine counts using only the index, but it's still a relatively slow operation, and people don't normally need an accurate count for records in the range of 100,000+? :-)

Cheers,
mark

-- 
Mark Mielke [hidden email]
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

sathiya psql
In reply to this post by Shoaib Mir
will you please tell, what is autovacuuming... and wat it ll do... is there any good article in this....

On Thu, Mar 6, 2008 at 11:56 AM, Shoaib Mir <[hidden email]> wrote:
On Thu, Mar 6, 2008 at 5:19 PM, sathiya psql <[hidden email]> wrote:
buy every time i need to put ANALYZE...
this takes the same time as count(*) takes, what is the use ??



Dont you have autovacuuming running in the background which is taking care of the analyze as well?

If not then hmm turn it on and doing manual analyze then shouldnt I guess take much time!

But yes, I will say if its possible go with the trigger option as that might be more helpful and a very fast way to do that.


--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

Shoaib Mir


On Thu, Mar 6, 2008 at 5:31 PM, sathiya psql <[hidden email]> wrote:
will you please tell, what is autovacuuming... and wat it ll do... is there any good article in this....



Read this --> http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html#AUTOVACUUM

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

A. Kretschmer
In reply to this post by Mark Mielke
am  Thu, dem 06.03.2008, um  1:26:46 -0500 mailte Mark Mielke folgendes:

>  
>
>         There aren't a general solution. If you realy need the exact count of
>         tuples than you can play with a TRIGGER and increase/decrease the
>         tuple-count for this table in an extra table.
>
>
> Of course, this means accepting the cost of obtaining update locks on the count
> table.
>
> The original poster should understand that they can either get a fast estimated
> count, or they can get a slow accurate count (either slow in terms of select
> using count(*) or slow in terms of updates using triggers and locking).
>
> Other systems have their own issues. An index scan may be faster than a table
> scan for databases that can accurately determine counts using only the index,

No. The current index-implementation contains no information about the
row-visibility within the current transaction. You need to scan the
whole data-table to obtain if the current row are visible within the
current transaction.


> but it's still a relatively slow operation, and people don't normally need an
> accurate count for records in the range of 100,000+? :-)

right.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

sathiya psql
is there any way to explicitly force the postgres to use index scan

On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <[hidden email]> wrote:
am  Thu, dem 06.03.2008, um  1:26:46 -0500 mailte Mark Mielke folgendes:
>
>
>         There aren't a general solution. If you realy need the exact count of
>         tuples than you can play with a TRIGGER and increase/decrease the
>         tuple-count for this table in an extra table.
>
>
> Of course, this means accepting the cost of obtaining update locks on the count
> table.
>
> The original poster should understand that they can either get a fast estimated
> count, or they can get a slow accurate count (either slow in terms of select
> using count(*) or slow in terms of updates using triggers and locking).
>
> Other systems have their own issues. An index scan may be faster than a table
> scan for databases that can accurately determine counts using only the index,

No. The current index-implementation contains no information about the
row-visibility within the current transaction. You need to scan the
whole data-table to obtain if the current row are visible within the
current transaction.


> but it's still a relatively slow operation, and people don't normally need an
> accurate count for records in the range of 100,000+? :-)

right.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Fwd: count * performance issue

sathiya psql
In reply to this post by A. Kretschmer


---------- Forwarded message ----------
From: sathiya psql <[hidden email]>
Date: Thu, Mar 6, 2008 at 12:17 PM
Subject: Re: [PERFORM] count * performance issue
To: "A. Kretschmer" <[hidden email]>
Cc: [hidden email]


TRIGGER i can use if i want the count of the whole table, but i require for some of the rows with WHERE condition....

so how to do that ???


On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <[hidden email]> wrote:
am  Thu, dem 06.03.2008, um  1:26:46 -0500 mailte Mark Mielke folgendes:
>
>
>         There aren't a general solution. If you realy need the exact count of
>         tuples than you can play with a TRIGGER and increase/decrease the
>         tuple-count for this table in an extra table.
>
>
> Of course, this means accepting the cost of obtaining update locks on the count
> table.
>
> The original poster should understand that they can either get a fast estimated
> count, or they can get a slow accurate count (either slow in terms of select
> using count(*) or slow in terms of updates using triggers and locking).
>
> Other systems have their own issues. An index scan may be faster than a table
> scan for databases that can accurately determine counts using only the index,

No. The current index-implementation contains no information about the
row-visibility within the current transaction. You need to scan the
whole data-table to obtain if the current row are visible within the
current transaction.


> but it's still a relatively slow operation, and people don't normally need an
> accurate count for records in the range of 100,000+? :-)

right.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

A. Kretschmer
In reply to this post by sathiya psql
am  Thu, dem 06.03.2008, um 12:13:17 +0530 mailte sathiya psql folgendes:
> is there any way to explicitly force the postgres to use index scan

Not realy, PG use a cost-based optimizer and use an INDEX if it make
sense.


>
> On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <
> [hidden email]> wrote:

please, no silly top-posting with the complete quote below.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

A. Kretschmer
In reply to this post by A. Kretschmer
am  Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes:
> TRIGGER i can use if i want the count of the whole table, but i require for
> some of the rows with WHERE condition....
>
> so how to do that ???

Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on
this row. Can you show us the output for a EXPLAIN ANALYSE SELECT
count(*) from <your_table> WHERE <your_row> = ... ?


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: count * performance issue

sathiya psql

                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=205756.95..205756.95 rows=1 width=0) (actual time=114675.042..114675.042 rows=1 loops=1)
   ->  Seq Scan on call_log  (cost=0.00..193224.16 rows=5013112 width=0) (actual time=11.754..91429.594 rows=5061619 loops=1)
         Filter: (call_id > 0)
 Total runtime: 114699.797 ms
(4 rows)


it is now taking 114 seconds, i think because of load in my system.... any way will you explain., what is this COST, actual time and other stuffs....

On Thu, Mar 6, 2008 at 12:27 PM, A. Kretschmer <[hidden email]> wrote:
am  Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes:
> TRIGGER i can use if i want the count of the whole table, but i require for
> some of the rows with WHERE condition....
>
> so how to do that ???

Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on
this row. Can you show us the output for a EXPLAIN ANALYSE SELECT
count(*) from <your_table> WHERE <your_row> = ... ?


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list ([hidden email])
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


1234 ... 8
Loading...