Tuesday, 25. April 2006

russkaja

recently at klub ost. ska at its finest.



Actually, this posting is more of a small test of sevenload.de, a nicely made service from Germany, which apparently isn't online yet!? But since I also have a few videos lying around on my machine, I figured I might as well post something sensible :-)

related:
-> Ten video sharing services compared

multi-tasking overload

-> I can't concentrate. Can you?

jeremy nails it.

Thursday, 20. April 2006

MySQL linkdump

as a follow-up to my previous post, here are some more links:

-> http://meta.wikimedia.org/wiki/Wikimedia_servers#Server_list
1 active master, 6 active slaves; most of the machines having fast disks, RAID-0 and 16GB RAM

-> Status of all Wikipedia MySQL Servers
-> Status of a single Wikipedia MySQL Slave
no swapping! very low "cached" memory number, indicating that InnoDB behaves completely differently from MyISAM (i.e. from our setup) regarding disk caching. average system-load is below 1!

memory usage with our setup:
[graph: mem-month]

-> Brad Fitzpatrick's Notes on LiveJournal at the MySQLcon 2005
(note: you might have already read his notes from 2004, but these slides are updated and provide some more valuable information)
* "User-Clusters", i.e. livejournal-users are split into several clusters.
* master-master replication-setup
* Use InnoDB. Really. Little bit more config work, but worth it...fast as hell.
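To make the "User-Clusters" idea a bit more tangible, here is a minimal Python sketch of one way to split users across clusters. It is not taken from the slides: the `ClusterDirectory` class, the global lookup table and the "least-populated cluster" assignment rule are all assumptions for illustration.

```python
class ClusterDirectory:
    """Hypothetical global lookup table: user id -> cluster id."""

    def __init__(self, cluster_ids):
        # Number of users currently placed on each cluster.
        self._load = {cid: 0 for cid in cluster_ids}
        self._user_to_cluster = {}

    def assign(self, user_id):
        """Place a new user on the least-populated cluster (assumed policy)."""
        if user_id in self._user_to_cluster:
            return self._user_to_cluster[user_id]
        # Ties are broken by the smaller cluster id.
        cluster_id = min(self._load, key=lambda cid: (self._load[cid], cid))
        self._user_to_cluster[user_id] = cluster_id
        self._load[cluster_id] += 1
        return cluster_id

    def cluster_for(self, user_id):
        """Every query about a user's data is sent to that user's cluster."""
        return self._user_to_cluster[user_id]


directory = ClusterDirectory([1, 2, 3])
for uid in ["alice", "bob", "carol", "dave"]:
    directory.assign(uid)
print(directory.cluster_for("dave"))
```

The point of such a scheme is that no single database has to hold (or cache) the data of all users.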

Friday, 14. April 2006

MySQL Troubles at twoday.net

To say the least, i've gone through hell the last couple of days :-)
It started Monday night, when i once again tried to nail down the performance troubles we had been experiencing here at twoday.net. Performance tuning requires knowledge and experience in so many different areas (hardware, debian, mysql-db, helma, twoday) that this task still quite often ends up with me within the company.

So, Monday night I could quickly identify, with the help of helma's sql files, that the troubles reside solely on the database side, and that all other troubles (the infamous "maximum thread count reached"-message) were just symptoms of a slow database-server. I started out with the assumption that the db-indexes were not working correctly anymore, since some of the standard sql-statements took an insane amount of time (~8sec). So i tried to rebuild these indexes, to add some new ones and to remove some unused ones. An iterative task that takes hours with large databases. Somewhere around 4a.m. i got the deceptive impression that everything was working fine again, and i went to sleep.

Well, just for a couple of hours, since I was awakened by alerts that twoday.net was extremely slow again, and that it was even worse than before. That everything worked at 4a.m. was simply because there is not much traffic at that hour. Panic! Since we switched a while ago to MySQL 5 (from dotdeb.org), i started blaming this move. So i switched to all kinds of MySQL 5-binaries from the MySQL-website. I started switching back to MySQL 4.0. Nothing helped at all, the sql-statements still took far too long. And it wasn't just a certain kind of statement, even a simple SELECT count(*) from AV_TEXT took a second. SHOW PROCESSLIST showed that most of the connections were in "locked"-state, i.e. waiting to be processed, and that around 3-4 statements were actually being processed, but taking a couple of seconds each. As I mentioned a while ago, we have about 200-300 SQL-statements per second, so this meant trouble for sure.
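For illustration, this is roughly how such a SHOW PROCESSLIST snapshot can be tallied by state. It is a minimal Python sketch: the helper name `summarize_processlist` and the sample rows are made up, and the rows are assumed to arrive as dictionaries keyed by the SHOW PROCESSLIST column names (as a dict-style database cursor would return them).

```python
from collections import Counter


def summarize_processlist(rows):
    """Count connections per State; an empty or NULL state is counted as 'idle'."""
    return Counter((row.get("State") or "idle") for row in rows)


# Made-up sample rows, shaped like SHOW PROCESSLIST output.
sample = [
    {"Id": 11, "Command": "Query", "Time": 4, "State": "Locked",
     "Info": "SELECT count(*) from AV_TEXT"},
    {"Id": 12, "Command": "Query", "Time": 3, "State": "Locked",
     "Info": "SELECT count(*) from AV_TEXT"},
    {"Id": 13, "Command": "Query", "Time": 3, "State": "Locked",
     "Info": "SELECT count(*) from AV_TEXT"},
    {"Id": 14, "Command": "Query", "Time": 2, "State": "Sending data",
     "Info": "SELECT count(*) from AV_TEXT"},
    {"Id": 15, "Command": "Sleep", "Time": 30, "State": None, "Info": None},
]

for state, count in summarize_processlist(sample).most_common():
    print(state, count)
```

A snapshot where "Locked" dominates and only a handful of statements are actually running matches the symptoms described above.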

Switching database-versions obviously didn't help, so i started to blame the hardware. I completely moved the database to another server. It didn't help either. Same symptoms. More Panic!

After I had read through tons of MySQL-pages (it was already getting night again), I tried fiddling around with all kinds of Server System Variables. None of it had any effect, nada.

Finally I gave in, and started setting up a mysql-cluster, in order to distribute the queries evenly across two servers. Again it was 4a.m., my mind was not working anymore, and i went to sleep. Just to be awakened a couple of hours later, to hear that it didn't help either. The cluster-setup was fine, but the queries still took 4secs instead of 0.04secs, so i would have needed a hundred servers.
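The "hundred servers" figure is simple arithmetic, sketched here in Python. The function names are made up; the second function applies Little's law (average number of statements in flight = arrival rate x time per statement) to the roughly 300 statements per second mentioned above.

```python
def required_servers(slow_seconds, normal_seconds):
    """Factor by which capacity must grow if every query slows down this much."""
    return slow_seconds / normal_seconds


def statements_in_flight(rate_per_second, seconds_per_statement):
    """Little's law: average concurrency = arrival rate x time in the system."""
    return rate_per_second * seconds_per_statement


print(round(required_servers(4.0, 0.04)))        # 100
print(round(statements_in_flight(300, 0.04)))    # 12
print(round(statements_in_flight(300, 4.0)))     # 1200
```

With 1200 statements in flight instead of 12, it is no surprise that the application runs out of threads long before the database catches up.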

What really shocked me was that i was not able to bring back a twoday.net-setup that worked at least slowly. Whatever i tried ended up in "maximum thread count reached", and lots of unhappy users, whom i can fully understand. Generally speaking I am just not the type of guy who gives up easily. Hey, i have finished 6 marathons so far, and none of them was a piece of cake. But this time I gave in.

I went back to the office Wednesday morning, with absolutely no idea of what to do next. I started telling matthias and axel the full story. And while telling them everything chronologically, with them asking just the right questions, and with a sudden common inspiration, it all became obvious!

The twoday-database is growing and growing, and is now about 1.7GB big, with the single AV_TEXT-table accounting for most of this space (btw wikipedia's content is not much bigger). Hmm, so we've got 2GB RAM at the db-server (and also on the other machine i tried out). About 300KB of these are used for key-caching, and that's why i always assumed that we have plenty of RAM for the database. But, what is hardly mentioned anywhere, not even in the otherwise excellent O'Reilly book "High Performance MySQL", is that it makes a HUGE difference whether the OS is able to cache your data-files or not. With Oracle (according to Axel) this kind of disk-caching is handled by the db-server. With MySQL (at least with MyISAM tables, as in our setup) it is left to the operating system! Therefore you will never find any configuration parameter to adjust this, and no mysql status variable indicates the fact that the OS can't cache your files anymore!

I was just never aware that the full database had always been completely in memory, and that as soon as this is not possible anymore, there is just no chance to handle 300 queries per second. And honestly, I blame MySQL (or Zawodny, author of the aforementioned book) for not making this point clearer.

Solution 1: Make your database smaller!
Solution 2: Order more RAM!
Solution 3: Start thinking about Partitioning (a new feature in MySQL 5.1)

@1: Thanks to Axel, who gave me the decisive hint, it was damn easy to cut a twoday-database to nearly half of its size. Simply drop the TEXT_RAWCONTENT-column, and perform the search on the TEXT_TEXT-column. That is the reason why everything is running so smoothly here at twoday.net again.
@2: Easy solution, but in our case, the RAM will not arrive before the middle of next week. So, we have to wait for that, and hope solution 1 is good enough until then.
@3: Next Wednesday there is a (free) web presentation regarding Partitioning, which might be interesting: see here.

So, to sum things up, (painful) lessons learned from this nightmare:
* Always keep your database in memory!
* Start acting more like a team-player than a marathon-runner! :-)
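The first lesson can be turned into a rough sanity check: compare the size of the database files with the RAM that is left over for the operating system's file cache. The following Python sketch is only an illustration. All function names are made up, the /proc/meminfo parsing assumes Linux, and the amount of RAM reserved for everything else (0.5GB) as well as the "after" size of 0.9GB are assumed figures, not measurements.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path (e.g. a MySQL data directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of byte counts (Linux only)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            value = int(parts[0])
            if len(parts) > 1 and parts[1] == "kB":
                value *= 1024
            info[key.strip()] = value
    return info


def fits_in_page_cache(data_bytes, total_ram_bytes, reserved_bytes):
    """True if the data files could be cached completely after reserving RAM for the rest."""
    return data_bytes <= total_ram_bytes - reserved_bytes


GB = 1024 ** 3
reserved = int(0.5 * GB)  # assumption: key cache, mysqld, OS and other processes

# ~1.7GB of data on a 2GB machine, as described in the post.
print(fits_in_page_cache(int(1.7 * GB), 2 * GB, reserved))  # False
# After cutting the database to roughly half (0.9GB is an assumed figure).
print(fits_in_page_cache(int(0.9 * GB), 2 * GB, reserved))  # True
```

On a live server one would feed `directory_size_bytes()` with the actual data directory and `parse_meminfo()` with the contents of /proc/meminfo instead of the hard-coded numbers.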

Thursday, 13. April 2006

tag the net !

We are happy to announce that we were able to increase the quality and performance of our tag extraction algorithm. The improvements have especially benefited the quality of topics for German texts, but the extracted topics for English texts are even more accurate now, too. You might think that that's the old "20% better taste!"-blabla but you need not take my word for it. Go and see for yourself.
-> http://www.tagthe.net/blog/stories/1720596/
-> http://tagthe.net

So, in case you have checked out that service a while ago, and thought its speed and/or tag-quality sucks, then please check it out again.

A neat use-case is the delicious-plugin for example, which has now become a lot more useful than before.
-> http://www.tagthe.net/blog/stories/1342788/

Congrats to the boyz!

Wednesday, 5. April 2006

(Helma) developer wanted

Well, once again. The company keeps growing and growing.

We offer:
* An exciting, dynamic working environment
* nice colleagues [1,2], and (sometimes) nice bosses [3]
* a company-owned basketball court

We are looking for:
* A talented, smart, communicative,.. developer (m/f) with several years of experience in web development.
* Helma experience is not a must :-)

If you are interested, the best way is to get in touch with me and include your CV.

Monday, 3. April 2006

8mm films

Grandpa is/was an enthusiastic amateur filmmaker, and now has more than 50 reels of 8mm film ("normal" as well as "Double 8") of around 100 meters each. The earliest films date from 1946, and are not only interesting for us relatives, but are also rare historical documents of that time (although back then the footage served entertainment rather than documentation).

A quick Google search turned up prices of €0.60 per meter for professional digitization. That comes to around 2,500 euros!!
Simply filming the projection screen with a digital camera would be the alternative, but the quality would certainly suffer. Above all, the films keep tearing, and the sound quality of the projectors is miserable.

Maybe someone has already gathered some experience with this topic, and can pass a few tips/links on to me.

About michi

Michi a.k.a. 'Michael Platzer' is one of the Knallgraus, a Vienna-based New Media Agency that deals more and more with 'stuff' that is commonly termed Social Software.

Meet my fellow bloggers at Planet Knallgrau.
