technisches

Thursday, 9. March 2006

thomas fuchs workshop in vienna

-> http://www.wollzelle.com/seminare/ajax-richtig-verwenden
-> http://mir.aculo.us/...-hurry-still-some-places-left

Would be super, super interesting. But at 650€ it is simply too expensive. A pity. I'm glad that Matthias's internal trainings were free :-)

Friday, 17. February 2006

SQL Log File Analysis

Quite some time ago I provided a patch for Helma that allows finer-grained SQL log file analysis by recording the time, the table, the type and the SQL itself. Back then I quickly calculated some stats for twoday.net:
Type of DB requests: 98% selects, 1.9% updates, 0.1% inserts, 0.01% deletes
Used tables: 53% AV_TEXT, 25% AV_SKIN, 14% AV_IMAGE, 8% others
40 SQL statements per second


Regarding the distribution, not much has changed. What has changed is that we are now seeing about 200-300 SQL statements per second, despite an object cache size of 120,000. So we thought it was about time to take a closer look at the statements themselves. And this time we looked not at the relative number of requests, but at the relative amount of time spent.
R> round(tapply(data[,3], data[,1], sum) / sum(data[,3]), 3)
SELECT_ACCESSNAMES         SELECT_ALL       SELECT_BYKEY  SELECT_BYRELATION         SELECT_IDS         SELECT_MAX    SELECT_PREFETCH 
            0.001              0.000              0.071              0.161              0.731              0.001              0.034 
-> A surprisingly large share of the time (73%) is spent on fetching IDs, i.e. on queries that should already have been optimized by DB indices.
R> round(tapply(data[,3], data[,2], sum) / sum(data[,3]), 3)
   AV_CHOICE       AV_FILE      AV_IMAGE     AV_LAYOUT AV_MEMBERSHIP     AV_MODULE       AV_POLL       AV_SITE       AV_SKIN     AV_SYSLOG       AV_TEXT       AV_USER       AV_VOTE
       0.001         0.001         0.077         0.006         0.015         0.000         0.000         0.065         0.085         0.001         0.685         0.065         0.000

R> round(tapply(data[,3], data[,1:2], sum) / sum(data[,3]), 3)
                   table
type                AV_CHOICE AV_FILE AV_IMAGE AV_LAYOUT AV_MEMBERSHIP AV_MODULE AV_POLL AV_SITE AV_SKIN AV_SYSLOG AV_TEXT AV_USER AV_VOTE
 SELECT_ACCESSNAMES        NA   0.000    0.001     0.000         0.000        NA      NA      NA   0.000        NA      NA      NA      NA
 SELECT_ALL                NA      NA       NA        NA            NA        NA      NA      NA   0.000        NA      NA      NA      NA
 SELECT_BYKEY               0      NA    0.007     0.006         0.010         0       0   0.003   0.000        NA   0.032   0.013      NA
 SELECT_BYRELATION         NA   0.001    0.046     0.000         0.003         0      NA   0.012   0.071        NA   0.001   0.027       0
 SELECT_IDS                 0   0.000    0.023     0.000         0.002        NA       0   0.049   0.014     0.001   0.617   0.025       0
 SELECT_MAX                NA      NA    0.000     0.000         0.000        NA      NA   0.000   0.000     0.000   0.001   0.000      NA
 SELECT_PREFETCH           NA      NA    0.000        NA            NA        NA      NA   0.000      NA        NA   0.034      NA      NA
-> So the biggest part of the SQL time (61.7%) is used for fetching IDs from the AV_TEXT table.
-> Surprisingly, second place already goes to the fetching of skins (7.1%), which is especially surprising since 75% of our users have never touched a single skin.
-> In fourth place we have the fetching of images via their name (4.6%).
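
For readers without R at hand, the time-share aggregation used above can be sketched in Python. This is only a minimal sketch: it assumes each log record is a (type, table, milliseconds, statement) tuple in the same column order as `data`, and the sample rows are made up for illustration, not twoday.net measurements.

```python
from collections import defaultdict

# Hypothetical sample records: (type, table, milliseconds, statement).
# The values are invented and only chosen so that they sum to 100 ms.
records = [
    ("SELECT_IDS", "AV_TEXT", 60, "stmt A"),
    ("SELECT_IDS", "AV_TEXT", 10, "stmt A"),
    ("SELECT_BYRELATION", "AV_SKIN", 15, "stmt B"),
    ("SELECT_BYKEY", "AV_TEXT", 5, "stmt C"),
    ("SELECT_BYKEY", "AV_USER", 10, "stmt D"),
]


def time_share(rows, key):
    """Sum the time per group and divide by the overall time,
    like round(tapply(time, group, sum) / sum(time), 3) in R."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[2]
    grand_total = sum(totals.values())
    return {group: round(ms / grand_total, 3) for group, ms in totals.items()}


def top_statements(rows, n):
    """Rank statements by their summed time, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[3]] += row[2]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


by_type = time_share(records, key=lambda r: r[0])
by_table = time_share(records, key=lambda r: r[1])
by_type_and_table = time_share(records, key=lambda r: (r[0], r[1]))
ranking = top_statements(records, 2)
```

Unlike the R cross table, the dictionary keyed by (type, table) simply has no entry for combinations that never occur, where R prints NA.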

So, which are the overall top sql-statements regarding computation time?
R> sort(tapply(data[,3], data[,4], sum), dec=T)[1:30]
286699 ms  0.178  SELECT AV_TEXT.TEXT_TOPIC FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE > 0) GROUP BY TEXT_TOPIC ORDER BY TEXT_TOPIC
209549 ms  0.130  SELECT AV_TEXT.TEXT_DAY FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE ) GROUP BY TEXT_DAY ORDER BY TEXT_DAY desc
136830 ms  0.085  SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_PROTOTYPE = Story AND TEXT_ISONLINE > 0) ORDER BY TEXT_CREATETIME DESC
 92191 ms  0.057  SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_TEXT_STORY  AND (TEXT_PROTOTYPE=Comment) ORDER BY TEXT_MODIFYTIME DESC
 74457 ms  0.046  SELECT AV_SITE.SITE_ID FROM AV_SITE WHERE (SITE_ISONLINE > 0 AND SITE_ISBLOCKED  AND SITE_SHOW ) ORDER BY SITE_LASTUPDATE desc
 63238 ms  0.039  SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND (TEXT_ISONLINE > 0 AND (TEXT_PROTOTYPE = Story OR TEXT_PROTOTYPE = Comment)) ORDER BY TEXT_MODIFYTIME DESC
 56952 ms  0.035  SELECT AV_TEXT.TEXT_ID FROM AV_TEXT WHERE AV_TEXT.TEXT_F_SITE  AND AV_TEXT.TEXT_DAY = XXX AND (TEXT_PROTOTYPE = Story and TEXT_ISONLINE ) ORDER BY TEXT_CREATETIME desc
 49750 ms  0.031  SELECT AV_TEXT.* FROM AV_TEXT WHERE AV_TEXT.TEXT_ID
 47139 ms  0.029  SELECT AV_IMAGE.* FROM AV_IMAGE  WHERE AV_IMAGE.IMAGE_ALIAS = III AND AV_IMAGE.IMAGE_F_SITE  AND (IMAGE_PROTOTYPE = Image and IMAGE_F_IMAGE_PARENT is null)
 41664 ms  0.026  SELECT AV_USER.* FROM AV_USER  WHERE AV_USER.USER_NAME = UUU AND (USER_AUTH_TYPE = local AND USER_NAME IS NOT NULL)
-> Oha! Big surprise (at least for me): 30% of the database time is used for determining the "dynamic" HopObjects Day and Topic, which are generated by grouping over all stories of a site. 30%! That should be reason enough to normalize them into distinct tables.
-> The second insight: we no longer add layouts without skins to res.skinpath, which saves us a lot of selects on the skin table.
-> And thirdly, we might want to consider inserting the img tags directly into a story instead of the image macros, as we already did for weblife. Your WYSIWYG editor will also thank you.
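
To make the first point more concrete, here is a minimal sketch of what such a normalization could look like, written in Python against an in-memory SQLite database. The AV_TOPIC table, its columns and the `add_story` helper are hypothetical and are not part of twoday or Antville. The idea is to maintain one row per (site, topic) when a story is stored, so that listing the topics of a site no longer has to group over all of its stories.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AV_TEXT (
    TEXT_ID        INTEGER PRIMARY KEY,
    TEXT_F_SITE    INTEGER,
    TEXT_PROTOTYPE TEXT,
    TEXT_ISONLINE  INTEGER,
    TEXT_TOPIC     TEXT
);
-- Hypothetical lookup table, not part of the original schema.
CREATE TABLE AV_TOPIC (
    TOPIC_F_SITE     INTEGER,
    TOPIC_NAME       TEXT,
    TOPIC_STORYCOUNT INTEGER NOT NULL,
    PRIMARY KEY (TOPIC_F_SITE, TOPIC_NAME)
);
""")


def add_story(site, topic, online=1):
    """Store a story and keep the per-site topic row in sync.
    Stories without a topic are left out of this sketch."""
    conn.execute(
        "INSERT INTO AV_TEXT (TEXT_F_SITE, TEXT_PROTOTYPE, TEXT_ISONLINE, TEXT_TOPIC) "
        "VALUES (?, 'Story', ?, ?)",
        (site, online, topic),
    )
    if online > 0:
        cur = conn.execute(
            "UPDATE AV_TOPIC SET TOPIC_STORYCOUNT = TOPIC_STORYCOUNT + 1 "
            "WHERE TOPIC_F_SITE = ? AND TOPIC_NAME = ?",
            (site, topic),
        )
        if cur.rowcount == 0:  # first online story with this topic on this site
            conn.execute("INSERT INTO AV_TOPIC VALUES (?, ?, 1)", (site, topic))


def topics_grouped(site):
    """The expensive variant: group over all online stories of a site."""
    rows = conn.execute(
        "SELECT TEXT_TOPIC FROM AV_TEXT WHERE TEXT_F_SITE = ? "
        "AND (TEXT_PROTOTYPE = 'Story' AND TEXT_ISONLINE > 0) "
        "GROUP BY TEXT_TOPIC ORDER BY TEXT_TOPIC",
        (site,),
    )
    return [row[0] for row in rows]


def topics_lookup(site):
    """The normalized variant: read the small lookup table instead."""
    rows = conn.execute(
        "SELECT TOPIC_NAME FROM AV_TOPIC WHERE TOPIC_F_SITE = ? "
        "AND TOPIC_STORYCOUNT > 0 ORDER BY TOPIC_NAME",
        (site,),
    )
    return [row[0] for row in rows]


add_story(1, "helma")
add_story(1, "sql")
add_story(1, "helma")
add_story(2, "ajax")
add_story(1, "drafts", online=0)  # offline stories do not create a topic row
```

A real implementation would also have to decrement the counter when a story is deleted, taken offline or moved to another topic. The same approach would apply to the Day grouping.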

JavaScript Libraries

SitePoint gives a comprehensive overview of some of the better-known JavaScript libraries:

-> http://www.sitepoint.com/.../javascript-libraries-and-patterns-yahoo-does-ajax

on Dojo: "The Rolls Royce of JavaScript libraries."

on Prototype: "Because Prototype is so good at making low-level scripting less painful, a number of higher-level libraries have been built with Prototype as a basis: Most notably: script.aculo.us and Rico"

on AjaxTK: "With the recent announcement that AjaxTK would be contributed as the foundation for Apache Kabuki, an open source AJAX toolkit, its future is looking brighter."

on Yahoo! UI Library: "If there is one thing to love about this library, it’s the documentation. From day one, every available component has full API documentation as well as a short “Getting Started” guide complete with working examples."

Tuesday, 14. February 2006

Yahoo Design Patterns

Item 3 on my list "what web2.0 is NOT about" was Open Source. So far, at least, all the big service providers have managed to keep their technologies and their software well under wraps. But maybe that is about to change too. Yahoo, at any rate, is taking a giant step and publishing its UI Ajax/DHTML library as well as its design patterns. The former is not that earth-shattering anymore, given the libraries based on prototype.js, but the latter I do find sensational. Want an example?
-> Object Pagination
-> Search Pagination
-> Auto-Complete
and so on and so on...

[via zawodny]

Saturday, 11. February 2006

BreakIterator

Didn't know this one at all:
-> http://java.sun.com/j2se/1.4.2/docs/api/java/text/BreakIterator.html

The BreakIterator is used to split a text into individual characters, words, lines and sentences. And yes, that also works for scripts/languages like తెలుగు and संस्कृतम्.

Tuesday, 24. January 2006

mod_jk vs mod_proxy

So, which one do we pick now?

Recently, twoday users reported that they clicked on links and were then served a completely different page (or, quite often, no page at all). When looking for the culprit there are plenty of candidates: apache, mod_jk, jetty, helma and of course the JDK. Well, this time I took a closer look at mod_jk.

mod_jk could be replaced by mod_proxy. As far as performance goes, nobody was ever quite sure which one is the better choice. A bit of googling currently yields the following answers to that question:

mod_proxy_http is almost always slower than a properly configured mod_jk
(due to the lack of persistant connections). The work is to get a 'properly
configured mod_jk' ;-).

Personally, I like mod_proxy_ajp, just for the integrated configuration options. The speed should be comperable to mod_jk, but I confess that I haven't actually run benchmarks on it.
[more]

but using AJP does perform more requests/sec because it uses less CPU over using mod_proxy HTTP style connectors to Tomcat [more]

In other words, a clear statement in favor of mod_jk, but on the other hand it also brings the brand-new mod_proxy_ajp module into play. Unfortunately there is still no Debian package for Apache 2.2, and all attempts at compiling and configuring this module myself have failed so far. So on that front it is wait and see.

So let's stick with mod_jk and simply bring it up to date. Well, think again: session cookies are now no longer accepted by Opera 8.5! See also here. So roll everything back, take a deep breath, and keep thinking.

Addendum: the corresponding bug report has now been taken care of as well -> link. Maybe more information will turn up there.

Thursday, 12. January 2006

upcoming xml-features in MySQL 5.1

select extractValue(doc,'/book/child::*') from x;
select UpdateXML(doc,'/book/author/initial','mp') from x;


[more]

Tuesday, 10. January 2006

Matrix Market

Matrices, matrices, anybody needs matrices?
-> http://math.nist.gov/MatrixMarket/

Interestingly, they offer matrices from the field of "Nuclear Reactor Design", but nothing related to "Document Analysis". Weird.

Tuesday, 3. January 2006

twoday 1.0.0 feedback?

I was delighted with the general response to our release of twoday 1.0.0, and I am now even more convinced that it was the right thing to do.
According to the SourceForge stats, there have been 138 downloads so far! What I was really looking forward to was receiving some feedback on how well (or not) the installation went. At first I was a little irritated by receiving none at all, but since some people actually did manage to install their own little twoday (and write about it), it seems that no feedback is actually good feedback. So there shouldn't be much in the way of declaring twoday 1.0.x stable pretty soon.

btw: It was of course very interesting to see how the story got picked up by the blogosphere, and to trace it through the blogosphere. Golem wrote a short (German) summary of my blog entry, probably after reading the corresponding message by dkg on pro-linux.de. From there quite a large number of blogs re-posted the Golem story, most of them verbatim.

Thursday, 22. December 2005

twoday 1.0.0 (beta)

As I already announced last week on helma-dev, we will provide twoday (pretty much as it can currently be enjoyed here at twoday.net) under an Open Source license (BSD-style!), and plan to "establish an active user and developer community around the twoday-software". A first step in this direction is the release of twoday-1_0_0beta, the setup of the corresponding sourceforge.net project, and the setup of the twoday.org wiki.

We already made a similar announcement over a year ago, when we planned on releasing twoday under the GPL, which we never actually did. This time we are more aware of the consequences, and should have time to maintain the software and support users as well as developers.

This is a pretty big step for us. Twoday became a core business for us over the past two years, and we invest quite some effort into its ongoing development. Projects like twoday.net, weblife.at, moday.at, twoday.tuwien.ac.at, ... should speak for themselves. Nevertheless I am fully convinced that this is the right step, and am very happy to work in an environment where my partners share the same opinion on Open Source.

So, go ahead and start your blog community today! (urgh, saying this actually did hurt a bit :-)) No matter if it's just for you, for your school/university, for your company, or whether you want to beat twoday.net regarding free hosting for everyone.
Well, actually better wait till twoday 1.0.x is declared stable, which should be soon, after receiving some first feedback from this release.

-> http://twoday.org
Still under construction (as every wiki) but at least the start page looks fine to me.

About michi

Michi, a.k.a. 'Michael Platzer', is one of the Knallgraus, a Vienna-based New Media Agency that deals more and more with 'stuff' that is commonly termed Social Software.

Meet my fellow bloggers at Planet Knallgrau.


Creative Commons License
