Friday, 17. February 2006

SQL Log File Analysis

Quite some time ago i provided a patch for Helma which allows finer-grained SQL Log File Analysis, by recording the time, the table, the type and the sql itself. Back then i quickly calculated some stats for twoday.net:
Type of DB-Requests: 98% Selects, 1,9% Updates, 0,1% Inserts, 0,01% Deletes
Used Tables: 53% AV_TEXT, 25% AV_SKIN, 14% AV_IMAGE, 8% others
40 SQL-Statements per second


Well regarding the distribution not much has changed. What has changed is, that we are now seeing about 200-300 SQLs per second, despite an object-cachesize of 120.000. So, we thought it was about time to take a closer look at the statements themselves. And this time, not the relative number of requests, but the relative amount of time spent was being looked at.
R> round(tapply(data[,3], data[,1], sum) / sum(data[,3]), 3)
SELECT_ACCESSNAMES         SELECT_ALL       SELECT_BYKEY  SELECT_BYRELATION         SELECT_IDS         SELECT_MAX    SELECT_PREFETCH 
            0.001              0.000              0.071              0.161              0.731              0.001              0.034 
-> A surprisingly high amount of time is being spent on fetching ids, i.e. queries, that should already have been optimized by db-indices.
R> round(tapply(data[,3], data[,2], sum) / sum(data[,3]), 3)
   AV_CHOICE       AV_FILE      AV_IMAGE     AV_LAYOUT AV_MEMBERSHIP     AV_MODULE       AV_POLL       AV_SITE       AV_SKIN     AV_SYSLOG       AV_TEXT       AV_USER	AV_VOTE
       0.001         0.001         0.077         0.006         0.015         0.000         0.000         0.065         0.085         0.001         0.685         0.065	0.000 

R> round(tapply(data[,3], data[,1:2], sum) / sum(data[,3]), 3)
                   table
type                AV_CHOICE AV_FILE AV_IMAGE AV_LAYOUT AV_MEMBERSHIP AV_MODULE AV_POLL AV_SITE AV_SKIN AV_SYSLOG AV_TEXT AV_USER AV_VOTE
 SELECT_ACCESSNAMES        NA   0.000    0.001     0.000         0.000        NA      NA      NA   0.000        NA      NA      NA      NA
 SELECT_ALL                NA      NA       NA        NA            NA        NA      NA      NA   0.000        NA      NA      NA      NA
 SELECT_BYKEY               0      NA    0.007     0.006         0.010         0       0   0.003   0.000        NA   0.032   0.013      NA
 SELECT_BYRELATION         NA   0.001    0.046     0.000         0.003         0      NA   0.012   0.071        NA   0.001   0.027       0
 SELECT_IDS                 0   0.000    0.023     0.000         0.002        NA       0   0.049   0.014     0.001   0.617   0.025       0
 SELECT_MAX                NA      NA    0.000     0.000         0.000        NA      NA   0.000   0.000     0.000   0.001   0.000      NA
 SELECT_PREFETCH           NA      NA    0.000        NA            NA        NA      NA   0.000      NA        NA   0.034      NA      NA
-> So, the biggest part of sql-time is being used for fetching IDs from the AV_TEXT-table.
-> And suprisingly already on second place is the fetching of Skins, which is especially surprising since 75% of our users have never touched a single skin.
-> On fourth place we have the fetching of images via their name.

So, which are the overall top sql-statements regarding computation time?
R> sort(tapply(data[,3], data[,4], sum), dec=T)[1:30]
286699 ms	0.178	SELECT AV_TEXT.TEXT_TOPIC FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE > 0) GROUP BY TEXT_TOPIC ORDER BY TEXT_TOPIC"                                                                     
209549 ms	0.130	SELECT AV_TEXT.TEXT_DAY FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE ) GROUP BY TEXT_DAY ORDER BY TEXT_DAY desc"                                                                         
136830 ms	0.085	SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story AND TEXT_ISONLINE > 0) ORDER BY TEXT_CREATETIME DESC"                                                                                  
92191 ms	0.057	SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_TEXT_STORY  AND (TEXT_PROTOTYPE=Comment) ORDER BY TEXT_MODIFYTIME DESC"                                                                                                  
74457 ms	0.046	SELECT AV_SITE.SITE_ID FROM AV_SITE WHERE (SITE_ISONLINE > 0 AND SITE_ISBLOCKED  AND SITE_SHOW ) ORDER BY SITE_LASTUPDATE desc"                                                                   
63238 ms	0.039	SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_ISONLINE > 0 AND (TEXT_PROTOTYPE = Story OR TEXT_PROTOTYPE = Comment)) ORDER BY TEXT_MODIFYTIME DESC"                                                    
56952 ms	0.035	SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND AV_TEXT.TEXT_DAY = XXX AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE ) ORDER BY TEXT_CREATETIME desc"                                                          
49750 ms	0.031	SELECT AV_TEXT.* FROM AV_TEXT WHERE AV_TEXT.TEXT_ID "                                                                                                                                                                             
47139 ms	0.029	SELECT AV_IMAGE.* FROM AV_IMAGE  WHERE AV_IMAGE.IMAGE_ALIAS = III AND AV_IMAGE.IMAGE_F_SITE  AND (IMAGE_PROTOTYPE = Image and IMAGE_F_IMAGE_PARENT is null)"                                                                      
41664 ms	0.026	SELECT AV_USER.* FROM AV_USER  WHERE AV_USER.USER_NAME = UUU AND (USER_AUTH_TYPE = local AND USER_NAME IS NOT NULL)"                                                                                                              
-> Oha! Big surprise (at least for me). 30% of the database time is used for determining the "dynamic" HopObjects Day and Topic, which are generated by grouping over all stories of a site. 30%! Well, this should be reason enough to normalize/separate them into distinct tables.
-> The second insight was, that we dont add layouts with no skins to the res.skinpath anymore (which saves us from lots of selects to the skin-table)
-> And thirdly, we might want to consider to directly insert the img-tags into a story, instead of the image-macros, as we already did for weblife. Your WYSIWYG-editor will also thank you.

JavaScript Libraries

SitePoint gives a comprehensive overview over some of the better-known JavaScript Libraries:

-> http://www.sitepoint.com/.../javascript-libraries-and-patterns-yahoo-does-ajax

on Dojo: "The Rolls Royce of JavaScript libraries."

on Prototype: "Because Prototype is so good at making low-level scripting less painful, a number of hihger-level libraries have been built with Prototype as a basis: Most notably: script.aculo.us and Rico"

on AjaxTK: "With the recent announcement that AjaxTK would be contributed as the foundation for Apache Kabuki, an open source AJAX toolkit, its future is looking brighter."

on Yahoo! UI Library: "If there is one thing to love about this library, it’s the documentation. From day one, every available component has full API documentation as well as a short “Getting Started” guide complete with working examples."

Search

 

About michi

michi Michi a.k.a. 'Michael Platzer' is one of the Knallgraus, a Vienna-based New Media Agency, that deals more and more with 'stuff' that is commonly termed as Social Software.

Meet my fellow bloggers at Planet Knallgrau.

my delicious

Recent Updates

My Gadgets

Credits

Knallgrau New Media Solutions - Web Agentur f�r neue Medien

powered by Antville powered by Helma


Creative Commons License

xml version of this page
xml version of this page (summary)

twoday.net AGB

Counter



berufliches
blogosphaerisches
privates
spassiges
sportliches
technisches
trauriges
Profil
Logout
Subscribe Weblog