Thursday, March 29, 2007

Why Open source works for Me :)

I enjoy tinkering with open source apps and finding ways to get the most out of them for future projects. The exciting part is when people take these apps and put their own twist on the implementation. That is what I plan for our project: I am using the open source MediaWiki to deploy it. But as I mentioned in my previous post, MediaWiki does not mean Wikipedia. Project Wikipiniana will start out just like Wikipedia, but eventually it will transform into something that Wikipedia doesn't aim to be. In the future, Wikipiniana will mash up the entire Philippine cyberspace and become the biggest portal ever made for Filipinos. In other words... it will grow into more than just your typical encyclopedia. (Am I dreaming?)

Pardon me, but that's how I see it in one or two years. I still don't know whether this plan will evolve into something the Pinoy community will love or hate. Nevertheless, this project is intended and built for Filipinos, and nothing more.

Anyway, back to the open source topic. What I love most in the open source sphere are the CMSs. These tiny applications are so powerful that they can turn your imagination into reality. They can be used for prototyping e-commerce websites, for personal sites, or simply as a platform for spreading a cause or advocacy. These apps are developed by a community of developers I can only describe as "generous geeks". They find time to write code and develop apps for the masses, and these people have become my idols because they make a difference. They have dedicated their effort to creating solutions that speed up the free dissemination of information. These people speak by the code and read by the code.


I started with Plone during the early stage of my digital library project, then went with WordPress for our blogging needs and simple websites. Then, with the sudden turn of events and the requirement for a speedy project, I chose MediaWiki. And that's what I am tinkering with now.



Open source apps are easy to deploy as long as you are patient in reading the manuals and detailed instructions; otherwise you could ruin your work. They can be deployed in minutes, and they also run on an open source web server (Apache) and database (MySQL). After that, the re-engineering work follows.
I am not a hardcore developer because I have never written code for my projects yet. I can say that I am just a code pirate who integrates existing code that is freely distributed over the internet and mashes it up to my specifications. In other words, I just tweak the code that I understand, patch it, then try how it works.

Patience, and lots of it, is needed if you want to drive an open source development project. But there is one more thing you must bear in mind: open source apps have their limitations because they are built for a specific task. So it's your call if you decide to shop for an open source app. Just make sure it fits your needs not just now but one or two years from now; otherwise, if you go beyond your initial purpose, you could be trapped. If you don't know how to reinvent the wheel, you will remain frozen until an update is released. But if you are clear about your purpose, open source is the way to go.

Monday, March 19, 2007

Wikipiniana and the state of my Technophobic environment


With the start of this project and through contact work, the Wikipiniana project is pushing through with much excitement. The initial output of articles being generated left and right gave our in-house editors a headache. They are in limbo organizing these articles, not because they are overwhelmed but because they don't have a firm grasp of wiki technology.

I have laid down the platform and demonstrated its basic functions, and yet it's like we are not on the same page. Useless meetings, so to speak, but I never gave up.

Working with editors who were trained with paper and pen has been my strongest fear, even during the birth of my first project (Filipiniana.net). I am spending sleepless nights trying to figure out how they can come to appreciate working online. But what can I do? We are held back by the traditional mode of publishing, where scopes and outlines are generated in layers of pages. Technophobia is what we call it. But this change is inevitable, and everyone should adapt to new technology or be forgotten by the next generation.

This project is for the young ones... Generation Y, so to speak. This is the generation that was introduced from birth to words such as Star Trek, VCD and cellphones, and that is familiar with email and the internet at school. The young generation are the movers and the target of this project, and I am hoping to have people like them to work with. I am just so lucky that we have an IT consultant who understands my little crazy ideas. Plus my new boss, who is really brilliant, techie and can read my mind :)

I have always been fascinated with technology. Not because I was amazed by it, but because of how it directly affects my life at work. Technology eases my workflow and makes me more productive. This is something seldom recognized by traditional bosses. I remember one line that was thrown at me while I was mastering my Photoshop skills during lunch break:
"Why don't you take a nap rather than tinkering with things on your computer? You're just wasting electricity and the company's resources."
Those words encouraged me to strive more and make them realize that they were wrong.

And now, here I am again, trying to win their cooperation and understanding in support of the project. It is a project that has to be explained in layman's terms to make them understand what I am talking about. Maybe it won't be until it makes headlines that they change their minds and realize how important this project is. This is a first effort to revive the awareness that this company, which I have served for a decade, has something worthwhile to share with the Filipino people.

Saturday, March 17, 2007

WikiPiniana Begins


Last March 15, the debates ended and we were given the go signal to proceed with our crazy ideas (Gus and I understand that). That means I can now proceed with content generation for Wikipiniana. I operate an outsourcing company composed of 100 people who don't have an account yet, meaning they are buffers for future accounts. Well, it's good that the company realized that, instead of these people having nothing to do while being paid, they can have something to keep them busy. At least with this project they will be trained to use the internet for writing and become familiar with the birth of a new wiki in the Philippines. :)

There was resistance during the organization of the teams, but I eventually got their support when I explained the real purpose of the project and its advocacy. More or less, my team consists of those who volunteered for the project and have the passion to do it. The 100 people became 50, divided among the several topic categories that we prioritized.

I have been researching how to write "bot" scripts for automatically downloading content. If this plan had materialized earlier, maybe I wouldn't need these people. But this is a blessing in disguise, since I have realized that to make this project a success you need a real community to make it work, not "robots".

There are around 15,000 Philippine-related articles inside Wikipedia, and that's our target: aggregate them by category, match them against the category prioritization submitted by the editorial team, then dunk them into Wikipiniana. From the initial 275 articles that I have created inside Wikipiniana, which include the templates, rules and guidelines, I am expecting to reach 10,000 articles in 3 months.

Well, I just hope and cross my fingers

Thursday, March 15, 2007

The Basics of User Management

This time, I want to discuss user management inside a wiki site. Though you can also find this documentation on the MediaWiki site, I would like to reproduce it here and discuss a bit how we implement it inside project Wikipiniana.

User levels

Sysop
This is the most privileged user. A user marked as 'sysop' can delete and undelete pages, protect and unprotect pages, block and unblock IPs, issue read-only SQL queries to the database, and use a shortcut revert-to-previous-contributor's-revision feature in contribs. Sysops can also be referred to as "admins", since they have the privilege to do almost anything inside the wiki sandbox, but they cannot change the design of the skins. In the Wikipiniana project, a promoted sysop will have the following routine:

  • patrol pages - they will be allowed to delete all unnecessary content, meaning articles without substance, such as spam and adverts.
  • create projects - given this privilege, they can create projects such as portals or sub-portals.
  • participate in page updates - as sysops, they are allowed to participate in front page planning. They can update any content on protected pages.

Developer

This is largely obsolete and will be removed from future versions of the software.
A developer has special rights and sees additional features in the special pages (lock / unlock DB) as well as in setting user rights. Only a developer can unset (delete) the sysop rights of an admin. We never use this in Wikipiniana.

Bureaucrat
This is a user who is allowed to turn other users into sysops via the Special:Userrights page. At this time, only I have access to this as admin. When the time comes that the project grows, I will have to turn this over to diligent and promising contributors inside Wikipiniana.

Bot
A registered bot account. Edits by an account with this flag set will not appear by default in Recent changes; this is intended for mass imports of data without flooding human edits from view. (To show bot edits, either click the "Show bots" link on the Special:Recentchanges page, or append &hidebots=0 directly to the page URL.)

For further discussion of administration, please see:
http://meta.wikimedia.org/wiki/Help:Administration
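
As a rough sketch, the user levels above correspond to entries in the $wgGroupPermissions array, which can be set in LocalSettings.php. The lines below are only an illustration and not a copy of Wikipiniana's actual configuration; as far as I know they mirror what MediaWiki already ships as defaults, so check the right names against your own version before relying on them.

```php
// LocalSettings.php (illustrative sketch; assumed to match MediaWiki defaults)
// Format: $wgGroupPermissions[group][right] = true or false;

$wgGroupPermissions['sysop']['delete']          = true; // sysops may delete pages
$wgGroupPermissions['sysop']['protect']         = true; // sysops may protect and unprotect pages
$wgGroupPermissions['sysop']['block']           = true; // sysops may block and unblock users/IPs
$wgGroupPermissions['bureaucrat']['userrights'] = true; // bureaucrats may promote users via Special:Userrights
$wgGroupPermissions['bot']['bot']               = true; // bot edits are hidden from Recent changes by default
```

Setting any of these to false for a group takes that right away from the group, which is one way to narrow what a promoted sysop can do.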

Thursday, March 8, 2007

Combating Spam on MediaWiki




One of the assassins proliferating over the internet is spam. It invades not only our regular email but also websites. Many webmasters of social networking sites or community-based websites have experienced attacks from robot-generated or human spamming. Now that wiki technology is getting popular as an open platform, it is a prime target for attack. So, to combat spam and protect my wiki project, I had to come up with a solution.

After thorough research and testing of several types of anti-spam mechanisms, I found the "Completely Automated Public Turing test to tell Computers and Humans Apart" (CAPTCHA), a type of challenge-response test used in computing to determine whether the user is human. Now here's how to apply it, together with a few other filters, inside MediaWiki.

1. Checking article text for spam

Since Wikipiniana is a text-based collaborative writing project, the insertion of unfiltered words (pornographic or offensive statements) is an unavoidable circumstance: anybody who has nothing better to do in this world can ruin your good project. Filtering likely spam words out of your content eases the trouble of sanitizing it.


a. $wgSpamRegex

- Open your LocalSettings.php, then add the following code:

$wgSpamRegex = "/".
"s-e-x|livesex|animalsex|". // These match spam words.
"dirare\.com|". // This matches a spammer's domain name.
"overflow:\s*auto|". // This matches overflow:auto.
"height:\s*[0-4]px|". // This matches height:0px up to height:4px (most CSS-hidden spam).
"\<\s*a\s*href|". // This blocks raw HTML <a href links.
"display\s*:\s*none". // This matches display:none.
"/i"; // This ignores upper/lower case for letters.

(*note: populate $wgSpamRegex as desired)

- For syntax reference, please see Regular_Expression
- Expected output: when you try to add an article entitled "livesex", or an article that contains the word "livesex", the wiki will not save the article
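
Before saving the pattern into LocalSettings.php, it can help to try it against a few sample strings from the command line. The script below is a minimal sketch that rebuilds the same pattern and runs it through PHP's preg_match(); looksLikeSpam() and the sample texts are my own hypothetical additions, not part of MediaWiki.

```php
<?php
// Same pattern as in LocalSettings.php, repeated so this script runs on its own.
$wgSpamRegex = "/" .
    "s-e-x|livesex|animalsex|" .
    "dirare\.com|" .
    "overflow:\s*auto|" .
    "height:\s*[0-4]px|" .
    "\<\s*a\s*href|" .
    "display\s*:\s*none" .
    "/i";

// looksLikeSpam() is a hypothetical helper for this test, not a MediaWiki function.
function looksLikeSpam(string $text, string $regex): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($regex, $text) === 1;
}

// Made-up sample edits: one clean, three that the pattern should catch.
$samples = array(
    'History of Manila',
    'Visit LiveSex today',
    '<div style="display: none">hidden links</div>',
    '<a href="http://example.com">click</a>',
);

foreach ($samples as $text) {
    echo (looksLikeSpam($text, $wgSpamRegex) ? 'BLOCKED' : 'ok'), "\t", $text, "\n";
}
```

Only the first sample should come out as ok. If a legitimate sentence gets BLOCKED, the pattern is too greedy and needs trimming before it goes live.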

b. SpamBlacklist (automatically loads a spam list from a wiki; it filters only the text of external links). Follow these directions:

* Download SpamBlacklist.php and SpamBlacklist_body.php from http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/SpamBlacklist/

* Save them to /mywiki/extensions/SpamBlacklist/

* Open LocalSettings.php in the /mywiki/ folder and add the following code:

require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );

c. $wgSpamBlacklistFiles (comes after SpamBlacklist; purpose: loads the list of text filters you specify rather than the spam list automatically loaded from the wiki)

* Open LocalSettings.php in the /mywiki/ folder.
* Add the following code:

#right after require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );

$wgSpamBlacklistFiles = array( "specify URL of blacklist" );

#example of URL: "$IP/extensions/SpamBlacklist/mywiki_blacklist.php"
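
Put together, steps b and c might look like the fragment below. This is a sketch under assumptions: the local file name is the one from the example above, and the second entry is what I believe to be the shared Wikimedia blacklist URL, so verify it against the extension's README before using it.

```php
// LocalSettings.php (sketch)
require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );

$wgSpamBlacklistFiles = array(
    // Local list maintained by the wiki's own admins (file name from the example above).
    "$IP/extensions/SpamBlacklist/mywiki_blacklist.php",
    // Shared list on Meta; this URL is an assumption, check the extension's README.
    "http://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1",
);
```

Each line of a blacklist file is a regular expression fragment matched against the URLs of external links, and anything after a # on a line is treated as a comment.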

2. Identify whether text input comes from a human or a spam bot (CAPTCHA images)

d. Download ConfirmEdit.php and ConfirmEdit.i18n.php (latest update) from http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/ConfirmEdit/ and save them to the directory /mywiki/extensions/ConfirmEdit. If the ConfirmEdit folder does not exist, create it.

e. (optional) Open ConfirmEdit.php and customize $wgCaptchaTriggers, $ceAllowConfirmedEmail, and $wgCaptchaWhitelist as desired.

f. Open LocalSettings.php in the /mywiki/ folder.

g. Add the following line to LocalSettings.php:

require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );
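
Instead of editing ConfirmEdit.php itself in step e, the same three settings can, as far as I know, be overridden in LocalSettings.php right after the require_once line, which keeps the changes safe when the extension is updated. The values below are only an illustrative sketch, and the whitelisted domain is a made-up placeholder.

```php
// LocalSettings.php (sketch; values are illustrative assumptions)
require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );

$wgCaptchaTriggers['edit']          = false; // do not challenge every edit
$wgCaptchaTriggers['addurl']        = true;  // challenge edits that add new external links
$wgCaptchaTriggers['createaccount'] = true;  // challenge new account registration

// Let users with a confirmed e-mail address skip the CAPTCHA.
$ceAllowConfirmedEmail = true;

// Regex of known-good sites whose links never trigger the CAPTCHA (placeholder domain).
$wgCaptchaWhitelist = '#^https?://example\.org/#i';
```

Triggering only on added URLs keeps ordinary contributors from being challenged on every save while still catching the typical link-dropping bot.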


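3. To keep Google from giving additional rank to spammer sites

h. Open LocalSettings.php in the /mywiki/ folder (avoid editing DefaultSettings.php under /mywiki/includes directly, since it is overwritten on upgrade)
i. Make sure $wgNoFollowLinks is set to true, which is the default. With it on, external links get rel="nofollow" and earn spammers no search ranking; changing it to false would reward them.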

4. To set up proxy banning (***not advisable)

j. SORBS DNSBL (a support system)

(*note: CAPTCHA is a better alternative, as discussed on http://meta.wikimedia.org/wiki/Proxy_blocking)

5. Also, you must implement "User must be logged in to perform edit"

k. Open LocalSettings.php in the /mywiki/ folder

l. Add the following lines to LocalSettings.php:

$wgGroupPermissions['*']['edit'] = false;
$wgShowIPinHeader = false;


Now, I have my peace of mind...