Bad advertising

June 2nd, 2007

I often come across an ad that seems interesting enough to click. When I do, I’m usually disappointed, because I end up in a place I didn’t expect.

Dear advertisers!

When spending money on my click, if you’re advertising a product please take me to the product page, not your home page. I’m interested in the product, not you. Yes, this kind of teleportation is possible in the online world. All you achieve by sending me to the home page is to attach a mental note to your URL, “Can’t find what I’m looking for”, which will probably make me go away even when I’m ready to buy something.

Learn from Victoria’s Secret

Wife swap

June 2nd, 2007

Does anybody know what happened to the “Wife swap” reality show in Slovenia? Did it air without me noticing, or was it canceled due to lack of interest? It was announced last December and was supposed to happen this spring. Did it?

By the way, its homepage was obviously killed – with the show?

Blogorola has ping – Apache rewriting with time

June 2nd, 2007

Had says that Blogorola got a ping interface at http://api.blogorola.com/ping.
I hope this means that it won’t be requesting the feed every 2 minutes anymore. It should be getting a 304, but anyways…

Update #1: I posted a post in Slovenian. I have no idea what I was thinking. And I also figured out that it’s getting a 302, because my feed is at FeedBurner, a Google company.

Update #2: Hope floats. Blogorola’s “ItsyBitsy – spider” made 57 requests in the last 8 hours or so. My server doesn’t care much, cause it only serves a 302 and redirects to the FeedBurner hosted feed. What about yours? Are you willing to put up with this?

If you’re using Apache with mod_rewrite (chances are that you are), you can make sure the requests don’t go through to your backend and database with something like this:

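RewriteEngine On
# Only match the Blogorola spider
RewriteCond %{HTTP_USER_AGENT} ^ItsyBitsy
# TIME_HOUR is two digits and compared as a string: >05 matches 06 to 23
RewriteCond %{TIME_HOUR} >05
# Redirect the spider away during the day; from 00 to 05 it gets through
RewriteRule . http://blackhole/ [R=307,L]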

Add this to the .htaccess file in your WordPress folder (it should already be there) or basically anywhere on the server to disallow access to the spider at all times except at night when other traffic is low.

You could also use this to allow the spider to access the feed only when you're actually writing - e.g. you usually write your posts between 20 and 22, so you can allow access then and send it to the "blackhole" at other times. You can also use this to serve it different feeds at different times for whatever reason...
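To check which hours such a rule would cover before touching .htaccess, here is a minimal Python sketch of the same decision. It is not Apache code, just a model of it: the function name and the default window are made up for illustration, and I'm assuming "between 20 and 22" means 20:00 up to, but not including, 22:00. It compares hours as zero-padded strings because mod_rewrite's < and > compare lexicographically, not numerically.

```python
def spider_is_blocked(user_agent: str, time_hour: str,
                      allow_from: str = "20", allow_until: str = "22") -> bool:
    """Return True when the request should be sent to the "blackhole".

    time_hour is a zero-padded two-digit hour, like Apache's %{TIME_HOUR}.
    The spider is let through when allow_from <= time_hour < allow_until.
    (Hypothetical helper; the window boundaries are an assumption.)
    """
    # Same idea as the ^ItsyBitsy pattern: only the spider is affected.
    if not user_agent.startswith("ItsyBitsy"):
        return False
    # String comparison on purpose: it matches numeric order only
    # because the hours are zero-padded to two digits.
    in_window = allow_from <= time_hour < allow_until
    return not in_window


if __name__ == "__main__":
    for hour in ("05", "19", "20", "21", "22"):
        print(hour, spider_is_blocked("ItsyBitsy", hour))
```

If the window looks right here, the same boundaries can be carried over into RewriteCond lines on %{TIME_HOUR}.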

See more about rewriting with time at the Apache conf page or the rewrite guide.

Flash video not loading

May 25th, 2007

My colleagues have just found out that when loading a Flash video file into Flash, the URL must end with “.flv”.

Flash doesn’t check the type – if the URL does not end with “.flv” it won’t even request the file. This could lead to an endless battle between the Flash developer and the web developer because the static version would work and the online version wouldn’t. The Flash developer requests a static .flv file on the disk but when the web developer changes the request to a CMS based file the video doesn’t load.

So, when using a CMS, what you’d do is add “&ext=.flv” or even just “&.flv” to the CMS-provided URL (e.g. /loadContent.php?id=123) for it to load. You can do that in Flash or in the code that passes the video URL to the player.
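If you go the second route and fix the URL in the code that hands it to the player, it could look roughly like this. This is a sketch in Python, not what we actually run: the function name is made up, and "ext" is just a dummy parameter that the CMS is assumed to ignore.

```python
from urllib.parse import urlsplit, urlunsplit


def with_flv_suffix(url: str) -> str:
    """Return the URL with a dummy "ext=.flv" parameter appended last.

    Hypothetical helper: "ext" is assumed to be ignored by the CMS.
    """
    if url.endswith(".flv"):
        return url  # already ends with ".flv", nothing to do
    parts = urlsplit(url)
    query = f"{parts.query}&ext=.flv" if parts.query else "ext=.flv"
    # Drop any fragment so the result really ends with ".flv".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(with_flv_suffix("/loadContent.php?id=123"))
```

Static .flv URLs pass through unchanged, so the same call can be used for both the disk version and the CMS version.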

Funny.

Update: While checking other sites that load dynamic .flv files I found out that it might just check whether “.flv” appears anywhere in the URL. Still funny though.

Odprti kop / Strip Mine launch (live blog)

May 23rd, 2007

I’m in Cyberpipe, liveblogging the launch of Strip Mine, a service that adds searching capability to video that Slovenian national television produces. It also allows bloggers to link to such content, down to the exact second, from their blogs.

For example you could do this:

Milijonar: 12
They also call you Miško. Yes. People have called me that since I was little, I don’t know why. Mojimir, Miško. Mojimir Miško Cilenše… Source: RTV Slovenija

Andraž is now talking more about the language technologies that help users find data on the site…

The beta service was available at http://www.rtvslo.si/odprtikop/beta/. Check update #1.

The company that Andraž and Boštjan founded is located at http://www.zemanta.com/

The service knows how to link content to relevant sources – currently only Wikipedia. It also knows not to link some names, because linking first names doesn’t really make sense.

What it also does is find what the tv show is talking about and is thus able to link to pages in the http://www.rtvslo.si portal that also talk about the same things.

It uses Lucene/PyLucene as the indexer, Python, MySQL, Apache, Django, Snowball/PyStemmer for stemming, Lematizator.

Users will also find that content is only added as it is created. This means everything that is locally produced and subtitled.
The content will be there as long as the videos are available, and the quoting will work as long as the service is online.

APIs might come but are not there yet. The intention to create them was there, but they were not created yet since nobody would be using them. This means that we will probably be able to co-create the APIs as we use them. Comment on the service’s blog…

The service is completely open. It’s not fully featured, it’s “beta” but not beta. A lot of things are yet to come, but the service will be launched today anyway.

Andraž is saying that these kinds of services are popping up, not yet in the big media business though.

He’s now talking more specifically about language technologies. The computer knows how to read but might never understand what it’s reading. Faking the understanding gives us possibilities to enhance the experience when viewing content. As opposed to the Semantic Web this approach seems to be more practical in its core.

A case for a better experience is LinkedIn – currently it’s a passive service that you have to use. Andraž would like it to be active – search the calendar, find contacts, arrange meetings and just communicate this to the user. “I’m feeling lucky” services.

Interactivity – are we sure we want this from the computer? Don’t we just want it to be a better servant? Let it get orders and make decisions by itself.

But.

These problems are really hard to tackle. It’s hard enough to do this in an easy language, and we’re in Slovenia. What can we really do?

How about a service that automatically finds pictures for your current blog post? Maybe it even processes them so they fit your design?

Andraž is now joined by the cofounder and CEO of Zemanta, Boštjan Špetič, who is talking about what they’re going to work on. He also thanked Zvezdan Matrič, MMC and the Cyberpipe. The ideas are long-term and there is no end to the possibilities that these kinds of technologies provide.

And now the Q&A:
Q: reverse engineering the .890 subtitles format.
A: HEX editor to find the blocks and then trying to decode the encoding for the letters

Q: processing power
A: it takes two hours in total – one hour to download and one hour to process. Slovenian Wikipedia is small enough to hold in RAM to process faster. Linking to Rtvslo news is scaled down to about 20,000 articles. They’re linked to every paragraph.

Q: will the paragraphing heuristics change
A: no, probably not.

Q: how much information is already in the subtitles? timestamps?
A: all timestamps are based on the subtitles, could be done without in the future.

Q: did the subtitling process change?
A: no

Q: what about the future? will there be voice recognition?
A: focus is on smart processing, there might be voice recognition to define stuff.

Q: all automatic?
A: yes. no manual changing is done currently.

Q: will there be?
A: it’s possible. maybe in the future the journalists will change the data.

Q: how much time was spent and what was the plan?
A: experimental from October 2006. rewritten for production once the prototype was done. the service gradually got bigger as it was developed.

Q: how about the pictures?
A: tricks. we know the beginning and the end of the paragraph. the interval usually contains about 5000 frames. which one to take? you take an image with a certain JPEG size – smaller are too blurry, bigger are also not good.

Q: is all text online?
A: everything is in the show except the prebuild subtitles.

Q: congrats. web has video with no subtitles. what are you planning to do in the future since you’re in a niche market?
A: subtitling of videos is expensive so we’re not going there. we’ll be in niche markets – blogging, finding already tagged content, ...
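A side note on the picture trick from the Q&A: picking one frame out of roughly 5000 by its JPEG size could be sketched like this in Python. This is my guess at the idea, not their code. The function name, the input format (frame index mapped to JPEG size in bytes) and the target size are all assumptions, since the actual "certain JPEG size" was not mentioned.

```python
def pick_frame(frame_sizes: dict, target_bytes: int) -> int:
    """Return the index of the frame whose JPEG size is closest to target.

    frame_sizes maps frame index -> JPEG size in bytes (assumed format).
    Ties are broken in favour of the earlier frame.
    """
    if not frame_sizes:
        raise ValueError("no frames to choose from")
    return min(
        frame_sizes,
        key=lambda index: (abs(frame_sizes[index] - target_bytes), index),
    )


if __name__ == "__main__":
    # Made-up sizes: a blurry frame, a reasonable one, a very busy one.
    sizes = {0: 4000, 1: 21000, 2: 60000}
    print(pick_frame(sizes, target_bytes=20000))
```

The reasoning behind it, as I understood it: a frame that compresses to a very small JPEG is probably blurry or nearly empty, and a very large one is not a good pick either, so you aim for something in between.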

And we’re done.

Update #1:
Now that the service has launched, it’s available at http://www.rtvslo.si/odprtikop.

Carwash

May 23rd, 2007

Washing cars seems to be very popular among popular people lately. My car usually needs a fresh car wash so when you’re making a video about it please contact me.