Bug #15343 Wrong SUM Calculation with Join and GROUP BY
Submitted: 30 Nov 2005 8:27 Modified: 2 Jan 2006 17:37
Reporter: Robert Klemme Email Updates:
Status: No Feedback Impact on me:
None 
Category:MaxDB Severity:S2 (Serious)
Version:7.5.00.30 OS:Linux (RedHat EE 4)
Assigned to: CPU Architecture:Any

[30 Nov 2005 8:27] Robert Klemme
Description:
MaxDB returns wrong results (less than would be correct) for a SUM() aggregation if joined with another table (1 record) and two (2) columns in GROUP BY clause (see queries).

Unfortunately I couldn't come up with a test scenario that will safely reproduce this behavior. But I regard the issue serious enough to not leave it unreported.

How to repeat:
Unfortunately I haven't been able to come up with a test case that certainly will reproduce this error.  I will try to come up with more information as soon as possible.  Here's what I got so far:

db parameter JOIN_OPERATOR_IMPLEMENTATION ='YES' or 'NO' (doesn't show up when set to 'IMPROVED').

DDL of involved tables and views:

CREATE TABLE ag_hourlymaster
(
   timestampid         TIMESTAMP         NOT NULL,
   groupid             FIXED(5)    NOT NULL,
   urlid               FIXED(9)    NOT NULL,
   categoryid          FIXED(6)    DEFAULT 1 NOT NULL,
   userid              FIXED(9)    NOT NULL,
   referrerid          FIXED(9)    NOT NULL,
   serverinfoid        FIXED(9)    NOT NULL,
   applianceid         FIXED(9)    NOT NULL,
   transcodeid         FIXED(9)    NOT NULL,
   actionid            FIXED(5)    NOT NULL,
   status              FIXED(9)    NOT NULL,
   userinfoid          FIXED(9)    NULL,
   refcount            FIXED(9)    NULL,
   avgresponsetime     FIXED(14,3) NULL,
   avgbytecount        FIXED(14,3) NULL,
   avgbytesdropped     FIXED(14,3) NULL,
   avgpacketsrequested FIXED(14,3) NULL,
   avgpacketsserved    FIXED(14,3) NULL,
   avgpacketsresent    FIXED(14,3) NULL,
   avgpacketsdropped   FIXED(14,3) NULL,
   avgplaytime         FIXED(14,3) NULL,
   bandwidth           FIXED(14,3) NULL,
   avgconnecttime      FIXED(14,3) NULL,
   avgbrowsetime       FIXED(14,3) NULL,
   reqmodid            FIXED(6)    NULL,
   respmodid           FIXED(6)    NULL,
   virusid             FIXED(9)    DEFAULT 1 NOT NULL,
   CONSTRAINT PK_hourlymaster PRIMARY KEY
            ( timestampid, applianceid, virusid, categoryid,
              transcodeid, serverinfoid, actionid,
              userid, status, urlid, referrerid, groupid)
)
//
CREATE INDEX IDX_hourly_userid ON ag_hourlymaster(userid)
//
CREATE INDEX IDX_hourly_appid ON ag_hourlymaster(applianceid)
//
CREATE VIEW v_ag_hourlymaster AS
SELECT  timestampid,
        groupid,
        urlid,
        categoryid,
        virusid,
        userid,
        referrerid,
        serverinfoid,
        applianceid,
        transcodeid,
        actionid,
        0 AS abortcodeid,
        status,
        userinfoid,
        refcount,
        avgresponsetime,
        avgbytecount,
        avgbytesdropped,
        avgpacketsrequested,
        avgpacketsserved,
        avgpacketsresent,
        avgpacketsdropped,
        avgplaytime,
        bandwidth,
        avgconnecttime,
        avgbrowsetime,
        reqmodid,
        respmodid,
        YEAR(timestampid)                AS dt_year,
        MONTH(timestampid)               AS dt_month,
        DAY(timestampid)                 AS dt_day,
        DAYOFWEEK(timestampid)           AS dt_dayofweek,
        HOUR(timestampid)                AS dt_hour,
        FLOOR((MONTH(timestampid)-1)/3)+1  AS dt_quarter,
        DATE(timestampid)                AS dt_date,
        (CASE WHEN ((transcodeid > 40 AND transcodeid < 56 ) OR
                    transcodeid = 201 OR transcodeid = 230 OR transcodeid = 231)
                 THEN 1
                 ELSE 0
         END) AS cache_hits,
        (CASE WHEN (transcodeid = 90 OR transcodeid = 145 OR transcodeid = 96)
                 THEN 1
                 ELSE 0
         END) AS reload_refs,
        0 AS abort_refs,
        (CASE WHEN (transcodeid > 89 AND transcodeid < 146)
                 THEN 1
                 ELSE 0
         END) AS non_cacheable_object_refs,
        (CASE WHEN ((transcodeid > 40 AND transcodeid < 56 ) OR
                    transcodeid = 201 OR transcodeid = 230 OR transcodeid = 231)
                 THEN 1
                 ELSE 0
         END) AS obj_hit_rate,
        (CASE WHEN ((transcodeid > 40 AND transcodeid < 56 ) OR
                    transcodeid = 201 OR transcodeid = 230 OR transcodeid = 231)
                 THEN avgbytecount
                 ELSE 0
         END) AS byte_hit_rate,
        (CASE WHEN transcodeid < 201
                THEN 1
                ELSE 0
         END) AS tcp_refs,
        (CASE WHEN (transcodeid > 200 and transcodeid < 230)
                THEN 1
                ELSE 0
         END) AS icp_refs,
        (CASE WHEN ((transcodeid > 70 AND transcodeid < 180) OR
                    transcodeid = 221 OR transcodeid = 232)
                THEN 1
                ELSE 0
         END) AS miss_refs,
        (CASE WHEN ((transcodeid > 40 AND transcodeid < 56 ) OR
                    transcodeid = 201 OR transcodeid = 230 OR transcodeid = 231)
                THEN avgbytecount
                ELSE 0
         END) AS cache_bytes,
        (CASE WHEN ((transcodeid > 70 AND transcodeid < 180) OR
                    transcodeid = 221 OR transcodeid = 232)
                THEN avgbytecount
                ELSE 0
         END) AS miss_bytes,
        (CASE WHEN (transcodeid > 89 AND transcodeid < 146)
                THEN avgbytecount
                ELSE 0
         END) AS non_cacheable_bytes
    FROM ag_hourlymaster
//
CREATE TABLE ag_abortcode
(
   abortcodeid FIXED(5)     CONSTRAINT pk_abortcode PRIMARY KEY,
   abortcode   VARCHAR(100) NOT NULL,
   description VARCHAR(255) NULL
)
//
CREATE UNIQUE INDEX IDX_abortcode
     ON ag_abortcode ( abortcode )
//
INSERT INTO ag_abortcode ( abortcodeid, abortcode, description ) VALUES ( 0, '-', '-' )

Query with correct result:

SELECT count(*) AS COL3, SUM(TA2.refcount)
FROM ag_abortcode TA1, v_ag_hourlymaster TA2
WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )

Queries with wrong result:

SELECT count(*) AS COL3, SUM(TA2.refcount)
FROM ag_abortcode TA1, v_ag_hourlymaster TA2
WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
GROUP BY TA1.abortcode, TA1.description

SELECT TA1.abortcode AS COL1, TA1.description AS COL2, SUM(TA2.refcount) AS COL3
FROM ag_abortcode TA1, v_ag_hourlymaster TA2
WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
GROUP BY TA1.abortcode, TA1.description

Note that the join is done on a column that is set to a const value in the view.

The error occurs as soon as there are two GROUP BY columns.

Another possible cause might be that statistics may be updated during bulk insert of data into ag_hourlymaster.

I wasn't able to reproduce this by just dumping the two tables and loading them into an empty database. I'm still searching for a simple test case.

Suggested fix:
The easiest fix would probably to disable JOIN_OPERATOR_IMPLEMENTATION and leave it at 'IMPROVED'. However, I don't know what side effects that would have.
[30 Nov 2005 8:29] Robert Klemme
DDL of tables and views

Attachment: ddl.sql (application/octet-stream, text), 4.99 KiB.

[30 Nov 2005 8:30] Robert Klemme
Queries

Attachment: query.sql (application/octet-stream, text), 589 bytes.

[2 Dec 2005 17:23] C.J. Adams-Collier
Hello Robert,

What is required to reproduce the issue?  Does it only happen after the database has been in heavy use?  Will it happen right after you create an instance?  Have you reproduced this problem on more than one system?  I'll plug your SQL into the loader and see if I can duplicate the errors.  It may take a couple of days, however, as I have some high priority issues on my plate right now.

Cheers,

C.J.
[3 Jan 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".