Archive for May, 2007

Flash video not loading

Friday, May 25th, 2007

My colleagues have just found out that when loading a Flash video file into Flash, the URL must end with “.flv”.

Flash doesn’t check the type – if the URL does not end with “.flv” it won’t even request the file. This can lead to an endless battle between the Flash developer and the web developer, because the static version works and the online version doesn’t: the Flash developer requests a static .flv file on disk, but when the web developer changes the request to a CMS-based file the video doesn’t load.

So, when using a CMS, add “&ext=.flv” or even just “&.flv” to the CMS-provided URL (e.g. /loadContent.php?id=123 becomes /loadContent.php?id=123&ext=.flv) for it to load. You can do that in Flash or in the code that passes the video URL to the player.


Update: While checking other sites that load dynamic .flv files I found out that Flash might only check whether “.flv” appears somewhere in the URL. Still funny though.

Odprti kop / Strip Mine launch (live blog)

Wednesday, May 23rd, 2007

I’m in Cyberpipe, liveblogging the launch of Strip Mine, a service that adds search capability to video produced by Slovenian national television. It also allows bloggers to quote such content on their blogs, linking to the exact second of it…

For example you could do this:

Milijonar: 12
“They also call you Miško. Yes. They’ve called me that since I was little, I don’t know why. Mojimir, Miško. Mojimir Miško Cilenše…” Source: RTV Slovenija

Andraž is now talking more about the language technologies that help users find data on the site…

The beta service was available at … (check update #1).

The company that Andraž and Boštjan founded is located at

The service knows how to link content to relevant sources – currently only Wikipedia. It knows not to link some names, because linking first names doesn’t really make sense.

What it also does is find what the tv show is talking about and is thus able to link to pages in the portal that also talk about the same things.

It uses Lucene/PyLucene as the indexer, Python, MySQL, Apache, Django, Snowball/PyStemmer for stemming, Lematizator.

Users will also find that content is only added as it is created. This means all the locally produced and subtitled shows.
The content will be there for as long as the videos are available, and quoting will work for as long as the service is online.

APIs might come but are not there yet. The intention to create them was there, but they were not created yet since nobody would be using them. This means that we will probably be able to co-create the APIs as we use them. Comment on the service’s blog…

The service is completely open. It’s not fully featured, it’s “beta” but not beta. A lot of things are yet to come, but the service will be launched today anyway.

Andraž is saying that these kinds of services are popping up, though not yet in the big media business.

He’s now talking more specifically about language technologies. The computer knows how to read but might never understand what it’s reading. Faking the understanding gives us possibilities to enhance the experience when viewing content. As opposed to the Semantic Web, this approach seems to be more practical at its core.

An example where the experience could be better is LinkedIn – currently it’s a passive service that you have to use. Andraž would like it to be active – search the calendar, find contacts, arrange meetings and just communicate this to the user. “I’m feeling lucky” services.

Interactivity – are we sure we want this from the computer? Don’t we just want it to be a better servant? Let it get orders and make decisions by itself.


These problems are really hard to tackle. It’s hard to do this even in an easy language, and we’re in Slovenia. What can we really do?

How about a service that automatically finds pictures for your current blog post? Maybe it even processes them so they fit your design?

Andraž is now joined by the cofounder and CEO of Zemanta, Boštjan Špetič, who is talking about what they’re going to work on. He also thanked Zvezdan Matrič, MMC and the Cyberpipe. The ideas are long-term and there is no end to the possibilities that these kinds of technologies provide.

And now the Q&A:
Q: reverse engineering the .890 subtitles format.
A: HEX editor to find the blocks and then trying to decode the encoding for the letters

Q: processing power
A: it takes two hours in total – one hour to download and one hour to process. Slovenian Wikipedia is small enough to hold in RAM to process faster. Linking to Rtvslo news is scaled down to about 20,000 articles. They’re linked to every paragraph.

Q: will the paragraphing heuristics change?
A: no, probably not.

Q: how much information is already in the subtitles? timestamps?
A: all timestamps are based on the subtitles, could be done without in the future.

Q: did the subtitling process change?
A: no

Q: what about the future? will there be voice recognition?
A: focus is on smart processing, there might be voice recognition to define stuff.

Q: all automatic?
A: yes. no manual changing is done currently.

Q: will there be?
A: it’s possible. maybe in the future the journalists will change the data.

Q: how much time was spent and what was the plan?
A: experimental from October 2006. rewritten for production once the prototype was done. the service gradually got bigger as it was developed.

Q: how about the pictures?
A: tricks. we know the beginning and the end of the paragraph. the interval usually contains about 5000 frames. which one to take? you take an image with a certain JPEG size – smaller are too blurry, bigger are also not good.

Q: is all text online?
A: everything is in the show except the prebuilt subtitles.

Q: congrats. web has video with no subtitles. what are you planning to do in the future since you’re in a niche market?
A: subtitling of videos is expensive so we’re not going there. we’ll be in niche markets – blogging, finding already tagged content…

And we’re done.

Update #1:
As the service already launched it’s available at


Wednesday, May 23rd, 2007

Washing cars seems to be very popular among popular people lately. My car usually needs a fresh car wash so when you’re making a video about it please contact me.

The tables – part 2

Tuesday, May 22nd, 2007

In The tables – part 1 I wrote about how the basic table elements should be used in a POSH document.

What we have until now is shown in example 3:
<table summary="The steady growth of the company revenues in the last 10 years">
<caption>Company revenues in the last 10 years</caption>

As promised, we’ll now tackle the <colgroup> and <col> tags.

Column group

When you look at tabular data it doesn’t take long to figure out that the data is grouped in two different ways – rows and columns. If you’ve ever produced a table in HTML you’ll know that you create it by creating rows and then creating cells in them. This means that accessing data by columns is not really all that easy. What you do is go through all the rows and get the cells of the correct index and retrieve the data. The same goes for styling cells in a specific column.

Setting a table cell to align the text to the right would commonly be achieved by assigning it a certain class name that would be associated with a CSS rule that would align the text to the right [1]. When applying the same style to the whole column the intuitive thing would be to add that style to all the cells in the column.
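As a minimal sketch of the class-based approach, assuming a hypothetical class name `num` and made-up figures:

```html
<style>
  /* Hypothetical class name; any name works as long as the CSS rule matches it. */
  .num { text-align: right; }
</style>
<table>
  <tr>
    <th>Year</th>
    <th>Revenue</th>
  </tr>
  <tr>
    <td>2006</td>
    <!-- The class has to be repeated on every cell of the column. -->
    <td class="num">1,200</td>
  </tr>
  <tr>
    <td>2007</td>
    <td class="num">1,450</td>
  </tr>
</table>
```

Every cell in the column carries the same class, which is the repetition a column-level element would ideally remove.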

But! Here comes the <colgroup> element.

The semantic meaning of the <colgroup> element is the grouping of columns into groups. Not surprising. What you get from this is the possibility to style these columns. The colgroup element has a few available attributes:

  1. span sets how many columns are in the colgroup. It defaults to 1.
  2. width sets the width of each column in the group.
  3. align sets the text-align of all the cells in the column.
  4. valign sets the vertical-align of all the cells in the column.

Span is only used if the <colgroup> element contains no <col> elements – otherwise the number of columns is calculated from the spans of the child <col> elements. The width contains the width of the columns included in the <colgroup> element but is overridden by the widths of the child <col> elements if present. It may contain a special value of 0* which means that the column should be as small as possible but wide enough to contain all the content.
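A small sketch of both cases, with illustrative widths and cell contents that are assumptions and not taken from the examples of this series:

```html
<table>
  <!-- No <col> children, so span is used: this group covers two columns,
       each as narrow as its content allows (0*). -->
  <colgroup span="2" width="0*"></colgroup>
  <!-- With <col> children the group's own span is ignored: the group
       covers 2 + 1 = 3 columns. The last <col> overrides the group width. -->
  <colgroup width="100">
    <col span="2">
    <col width="200">
  </colgroup>
  <tr>
    <td>A</td><td>B</td><td>C</td><td>D</td><td>E</td>
  </tr>
</table>
```

The row has five cells to match the five columns declared by the two groups.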

Two additional attributes are present in the standard – char and charoff – but support for them is not required by the standard, so we’re not going to go into them even though they could be useful in some cases.


To specify a column without grouping columns into groups you can use the <col> element. You can also use the <col> element to specify special columns inside a column group. If you use one of the two directly inside the <table> you cannot use the other.

The attributes for the element are the same as for the <colgroup> element. They’re also used in the same way.
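For illustration, a sketch of <col> used directly inside the <table>, again with assumed widths and contents:

```html
<table>
  <!-- <col> elements sit directly inside <table>,
       so no <colgroup> may appear in this table. -->
  <col width="150">
  <!-- One <col> can describe several columns through span. -->
  <col span="2" align="right">
  <tr>
    <td>Item</td><td>10</td><td>20</td>
  </tr>
</table>
```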


<colgroup> and <col> seem like a solution to a lot of problems when styling tables. You often want a whole column to be aligned to the right, to have a certain background or be in a specific color. You might even want it to have a different border. All this could be done by styling the class name or the id of these elements.

Unfortunately all this only works in Internet Explorer. It doesn’t work in Firefox 2.0 (none of it) or Opera 9.2 (only the align attribute works, styling with CSS doesn’t). I don’t know if it works in Safari but that doesn’t really matter, since the failures above are enough of a reason not to use this for styling. The only thing that is useful (it works in all the browsers tested) is the width attribute.

You can check all this in example #4.

Next time…

In the next edition of the tables, hopefully less than two months away, we’ll check the content grouping tags: thead, tbody and tfoot.

  1. No, align="right" would not be ok and style="text-align:right;" is also not the best idea due to the problems of changing this style and applying it to more cells. back

WordPress AutoSave

Monday, May 21st, 2007

I’ve created a Greasemonkey script (and a Firefox Add-on) that adds an unobtrusive AutoSave feature to WordPress 2.0.2 and probably all the other WordPress versions 2.x. Could be that it even works with previous WordPress versions.

I decided I needed to write the script after I lost half a post two times in a row because I accidentally pressed the ‘Back’ button on my new ThinkPad keyboard. This annoyed me so much I went to check if there are AutoSave plugins available. When I didn’t find any (one had the page unavailable, the other possibility was upgrading to 2.2) I decided to write a GM script. It’s pretty easy and the functionality grew as I wrote it. It’s useful for me, it might also be useful for you.

More about the WordPress AutoSave script on its own page.

The semantics of <small> and a POSH pattern for footnotes

Sunday, May 20th, 2007

It was probably more than a month ago that markos asked me about the semantics of less important items. We had a short discussion about it and found no relevant tags to mark up a part of text that was less important than the rest of it. An easy example would be a footnote, legal text or any other stuff you would usually make smaller in the world of looks over semantics.

A few days ago I was reminded that I forgot to write about it back then when I saw this issue resurface on the WSG mailing list.

When the semantics of HTML were called into question a lot of the tags were ‘deprecated’. Not all were marked deprecated in the standard itself; some were rather marked as bad practice in the web standards community [1]. When trying to tell the client something is important you really should not be telling it to show it in a bold typeface – you have CSS to do that [2].

The problem is that when all these presentational elements were ‘killed’ somebody wasn’t thinking. Let’s see:

  1. <b> ‘deprecated’ in favor of <strong>
  2. <big> ‘deprecated’ in favor of ?
  3. <br> discouraged in favor of <p>
  4. <i> ‘deprecated’ in favor of <em>
  5. <s> and <strike> deprecated in favor of <del>
  6. <small> ‘deprecated’ in favor of ?
  7. <tt> ‘deprecated’ in favor of <code>
  8. <u> deprecated in favor of <ins> and because of confusion with <a>

You might have noticed the question marks in the list. The first one, the missing semantic replacement for <big>, isn’t really all that important. We have many ways to point out that something is important (if there was a semantic meaning – <h1>…<h6>, <strong>, <em>) or we can just use CSS to make it big. The problem lies in the second question mark. How do you mark up something that used <small> as some sort of semantic and not just as a way of presenting the data visually?

In the standard these are actually specified in the Graphics part of it. The meaning of <small> is “Renders text in a ‘small’ font.” That’s great, but what if I want to tell the world that what I put in there is a legal text? A footnote? Something deemphasized? POSH patterns to the rescue.

As you might have noticed this post uses footnotes. I’ve marked them so that the text in the article links to the footnote below and the footnote links back. To show that this is a footnote relation I’ve added a forward relation rel="footnote" to the link in the text and a reverse relation rev="footnote" in the footnote itself. I’m also using these to set the styles (which makes them break in IE6 and other stinky browsers). The footnotes are marked up as an ordered list (<ol>) with a class name “footnotes” and each footnote is a list item (<li>) which has an id “footnote-footnoteid-postid” that enables me to link to it.
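A sketch of what such markup could look like. The post id 123, the id on the link in the text (footnote-ref-1-123) and the placeholder sentences are assumptions for illustration; only the rel/rev values, the “footnotes” class and the footnote id scheme come from the description above.

```html
<!-- In the article text: a forward relation pointing at the footnote. -->
<p>
  Some claim that needs a note
  <a href="#footnote-1-123" id="footnote-ref-1-123" rel="footnote">[1]</a>.
</p>

<!-- At the end of the post: the footnotes as an ordered list. -->
<ol class="footnotes">
  <li id="footnote-1-123">
    The text of the note.
    <!-- Reverse relation linking back to the place in the text. -->
    <a href="#footnote-ref-1-123" rev="footnote">back</a>
  </li>
</ol>
```

Styles can then hook onto the relations with attribute selectors such as a[rel="footnote"], which IE6 does not support.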

When marking up legal notices you might want to use rel="license" on the link to the legal text. If there is no link pointing to the legal text, link from it to the part of the content it applies to with rev="license"; if it applies to the whole document just link to the page.

  1. Some are deprecated in HTML 4.01, removed in XHTML 1.0 Strict and some just marked as bad practice. For more details check the specification or other sources. back
  2. If you want more information why this is the way to go please read the POSH page and the articles and presentations linked on that page. back