Behemoth: October 2011

Monday, October 31, 2011

Hacking Weebly: PowWeb's Drag & Drop Builder

Background

What led to the Weebly hack-a-thon?
I'm a big fan of PowWeb because their unlimited bandwidth and storage plan doesn't leave me guessing. The simplicity of installing additional software (powered by SimpleScripts) also puts even the biggest Software-as-a-Service (SaaS) players in the market to shame.
When I was charged with hosting a website on PowWeb, I decided to give their wheel-of-fortune a big ol' whirl. In the beginning these were my top choices for building a website:
- Joomla
- Drupal
- ocPortal
- Xoops
With all four technologies at my disposal, I tried each and every one of them with the goal of piecing together a simple & attractive website. Every attempt was a huge failure on many levels, and sinking four hours into each technology left me quite sad. In the end I bit the bullet and succumbed to using the Weebly Drag & Drop Builder.
What had kept me from using Weebly in the first place?
- Popups that kept asking me to pay up for any drag & drop component that was even remotely useful for building a meaningful website. For example: being able to embed a simple video.
- I thought that any of the free and fully-featured Content Management Systems (CMS) available on PowWeb would easily beat anything Weebly had to offer. This was not true.
What keeps my love & "ughhh" relationship with Weebly going?
- Love: They have a lot of great looking templates to offer. Really really nice stuff in my opinion when one is starting from scratch and couldn't even color inside the lines in a picture book with crayons to save one's life.
- Ughhh: Very few of the templates follow basic design principles like:
  - Provide a way for the user to clearly & quickly distinguish which page they are on.
    - Most of the templates had tab bars to navigate through the website, but after clicking there was no indication of what was selected. There was no change in the text font, no change in the background color of the selected/clicked item in the navigation bar, and there were no breadcrumbs anywhere.
  - Do not force the user to scroll vertically.
    - If there might not be any content that takes up the space, then what's the point of enforcing a fixed-height or min-height template?
  - Don't add banner images/placeholders that cause unnecessary scrolling.
    - I could not understand why the templates wouldn't allow the banner placeholders to be removed, or resized if a thinner banner image was provided.
- Ughhh: A prototype website cannot be built with the palm-open and finger-twitching-for-cash like-a-bellboy approach that their pop-ups have going.
- Love: Someone in the Weebly camp understood that it takes time for any developer to truly get behind a technology via trial & error PLUS convince all vested parties that it is worth forking over cash for ... so they threw the "Custom HTML" widget into the free goodies bag. This allows hackers to put together a presentable website. Then, if there is value in everyone being able to edit the content easily, you take away the hacks, pay for the service, and use the simple drag & drop components that allow anyone to jump in and make edits without having the web-dev know-how.

Let's Hack!

How can images be replaced if the Weebly template does not allow it?
- Publish the Weebly website as-is and use the "View Source" option in your browser.
- Search for the image in the source. If it is directly embedded in the HTML, jot down the "id" of the "img" element and change its "src" attribute via a JavaScript call placed in a "Custom HTML" element. You'll have to add that element to your pages in the Weebly editor.
- If instead of the image there is a placeholder element like a "div" in the HTML, jot down the "id" of that "div" element, as it is probably being styled via one of the Weebly template's CSS files. Once you locate the CSS responsible, you will know which parts you want to override yourself via JavaScript and a "Custom HTML" widget in the Weebly editor. Here's an example:
How can expandable/collapsible (+/-) areas be added?
- Add a "Custom HTML" widget to your page in the Weebly Editor. Add the following to load the jQuery JavaScript library without conflicting with other libraries, like Prototype, which the Weebly template may already be loading on its own:
- Place a div around the toggle-able section, give it an id, and hide it by default. Then use jQuery to show/hide the content when the user clicks on a button or text that you provide as the control.
```
Click me to control show/hide.
The div wrapped content will be shown/hidden at your whim.
```

Thursday, October 20, 2011

HTML5 vs. Native Mobile Apps

Technologies that enable HTML5 either to deploy to multiple mobile platforms or to keep up with the native look:

PhoneGap
Sencha Touch
jqTouch
jQuery Mobile
- There is a minor issue in iOS that doesn't properly set the width when changing orientations with these viewport settings, but this will hopefully be fixed in a future release.
- It's not currently possible to deep link to an anchor (index.html#foo) on a page.

Monday, October 17, 2011

ElasticSearch and CouchDB: Match made in heaven?

In ElasticSearch (ES):

each indexed document is given a version number. This version number can be supplemented with an external value (for example, if maintained in a database). To enable this functionality, version_type should be set to external.
Sounds nice, primed for CouchDB, right? But:

The value provided must be a numeric, long value greater than 0, and less than around 9.2e+18.
The way CouchDB does versioning, the revision isn't numeric, because it joins two values (a revision counter and a content hash) with a hyphen to create a sequence/version, for example:
1-1234567890
So how is this handled in case of a CouchDB stream for ES?
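
One possible client-side workaround, offered only as a sketch and not as how any CouchDB-to-ES stream actually behaves: take the numeric counter in front of the hyphen and pass that as the external version. The function name `couch_rev_to_es_version` is hypothetical. Note the assumption baked in here: the counter alone cannot tell two conflicting revisions with the same prefix apart.

```python
MAX_ES_VERSION = 2**63 - 1  # a signed 64-bit long, roughly 9.2e+18


def couch_rev_to_es_version(rev: str) -> int:
    """Return the numeric prefix of a CouchDB revision such as '1-1234567890'.

    Hypothetical helper: it only extracts the counter before the hyphen and
    checks it against the range quoted above for external versions.
    """
    counter, hyphen, rest = rev.partition("-")
    if not hyphen or not rest or not (counter.isascii() and counter.isdigit()):
        raise ValueError(f"unexpected revision format: {rev!r}")
    version = int(counter)
    if not 0 < version <= MAX_ES_VERSION:
        raise ValueError(f"version out of range for ES: {version}")
    return version
```

Calling `couch_rev_to_es_version("1-1234567890")` yields `1`, which satisfies the "numeric, greater than 0" requirement.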
How does ES facilitate the generation of an "_id" based on the data in the incoming document?

Does it allow taking the value of a field from that document? For example, there is a way to perform "_routing" via one of the incoming document's fields for distributed indexing across shards. So what about something for picking out the id?
Does it allow concatenating the values of multiple fields of that document?

Same question as the one above for CouchDB.
TBD
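
Until those questions are answered, a client can always build the id itself before sending the document for indexing. This is a minimal sketch of that fallback, not an ES or CouchDB feature; `compose_id`, the separator, and the sample field names are all assumptions made for illustration.

```python
def compose_id(doc, fields, sep=":"):
    """Build an id string by joining the values of the chosen fields.

    Hypothetical helper: raises KeyError if any requested field is absent,
    so a malformed document never gets a partial id.
    """
    missing = [name for name in fields if name not in doc]
    if missing:
        raise KeyError(f"document is missing id fields: {missing}")
    return sep.join(str(doc[name]) for name in fields)


# Example: one field, or several concatenated.
doc = {"type": "tool", "name": "hammer", "width": "2 inches"}
single_field_id = compose_id(doc, ["name"])
combined_id = compose_id(doc, ["type", "name"])
```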

Sunday, October 9, 2011

Scalability Madness

| | CouchDB | CouchDB-Lucene | CouchIO / CouchOne / CouchBase | Solr | Elastic Search | MongoDB | BigCouch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Full-Text Search | No | Yes | ? | Yes | Yes | Maybe, with Photovoltaic | ? |
| Distribute-able | Yes | ? | ? | Yes | Yes | ? | ? |
| Distribute-ed | ? | ? | ? | No | Yes | ? | ? |
| Schema-less | Yes | ? | ? | ? | Yes | ? | ? |
| Tools for Importing Data | ? | ? | ? | Yes | ? | ? | ? |
| Comments | ? | Does it go toe to toe against all the features of Lucene exposed by Solr? | Best for HTML5 dev? | It is distributable but not distributed. SolrCloud has very few features. | Compass got punted to invent Elastic Search. | ? | ? |

Wednesday, October 5, 2011

Splitting up large XML data files for use with DIH in Solr

It is ridiculously beneficial to split up XML files if you will be using Solr's Data Import Handler (DIH) to process the data. I personally saw the speed improve from 166 entries/minute to 30860 entries/minute (roughly 186 times faster) after splitting up all the large XML data files into an individual file for every entity that is to become a Lucene document in Solr.

It was only on a whim but the script that allowed me to experiment with this and yield the desired results was found here:

awk '/<item>/{close("row"count".xml");count++}count{f="row"count".xml";print $0 > f}' *.xml

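So if your file looks something like:

<items>
  <item>
    <title>Item 1</title>
    <description>Description 1</description>
    ...
  </item>
  ...
  <item>
    <title>Item 20000</title>
    <description>Description 20000</description>
    ...
  </item>
</items>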

Then all the items from 1 to 19,999 will be divided up by this script into individual files named row1.xml, row2.xml ... row19999.xml, each of which looks like:

  <item>
    <title>Item N</title>
    <description>Description N</description>
    ...
  </item>

But the last (20,000-th) item will have a trailing tag:

  <item>
    <title>Item 20000</title>
    <description>Description 20000</description>
    ...
  </item>
</items>

If you have processed 10 files, each with 20000 entries using the splitter command mentioned above ... then basically every 20000, 20000*2 ... 20000*10 numbered file will need to have the trailing tag deleted from it. To that end, the following script can be edited by providing the # of original files in the while loop's comparison statement:

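#!/bin/sh
# Delete the stray wrapper tag from the row file that holds the
# last item of each original XML file.
if [ $# -eq 0 ]
then
  echo "Error - Number missing from command line argument"
  echo "Syntax : $0 number"
  echo " Deletes lines matching 'items>' from row<number>.xml, row<2*number>.xml, ..."
  exit 1
fi
n=$1
i=1
# 10 is the number of original files; edit it to match your own count.
while [ $i -le 10 ]
do
  echo "sed -ibak '/items>/d' row`expr $i \* $n`.xml"
  sed -ibak '/items>/d' row`expr $i \* $n`.xml
  i=`expr $i + 1`
done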

And then running the script by passing it the # of the last entry (20000-th):

./sanitize.sh 20000
sed -ibak '/items>/d' row20000.xml
sed -ibak '/items>/d' row40000.xml
sed -ibak '/items>/d' row60000.xml
sed -ibak '/items>/d' row80000.xml
sed -ibak '/items>/d' row100000.xml
sed -ibak '/items>/d' row120000.xml
sed -ibak '/items>/d' row140000.xml
sed -ibak '/items>/d' row160000.xml
sed -ibak '/items>/d' row180000.xml
sed -ibak '/items>/d' row200000.xml
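
For comparison, the split and the cleanup can be folded into a single pass. The following is a rough Python sketch of the same line-based idea as the awk and sed commands above, not a replacement I have benchmarked; the function names `split_items` and `write_rows` are made up for illustration, and it assumes (like the awk one-liner) that each `<item>` tag sits on its own line.

```python
def split_items(lines, start="<item>", wrapper="items>"):
    """Group lines into one chunk per <item>, dropping wrapper-tag lines.

    Mirrors the awk logic: a line containing `start` opens a new chunk, and
    lines before the first item are ignored. Lines containing `wrapper`
    (what the sed command deletes afterwards) are skipped up front.
    """
    chunks = []
    for line in lines:
        if wrapper in line:
            continue  # <items> or </items>
        if start in line:
            chunks.append([])
        if chunks:
            chunks[-1].append(line)
    return ["".join(chunk) for chunk in chunks]


def write_rows(chunks, prefix="row"):
    """Write each chunk to row1.xml, row2.xml, ... in the current directory."""
    for number, chunk in enumerate(chunks, start=1):
        with open(f"{prefix}{number}.xml", "w", encoding="utf-8") as handle:
            handle.write(chunk)


sample = [
    "<items>\n",
    "  <item>\n",
    "    <title>Item 1</title>\n",
    "  </item>\n",
    "  <item>\n",
    "    <title>Item 2</title>\n",
    "  </item>\n",
    "</items>\n",
]
rows = split_items(sample)
```

With real data you would pass an open file (or several files chained together) to `split_items` and hand the result to `write_rows`; no separate sanitize step is needed because the wrapper lines never reach the output.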

Import Dynamic Fields from XML into Solr via DIH

Given an XML file that needs to be imported into Solr, you may often run into some uncommon data values that would be:

- best grouped together under the banner of some dynamic field defined in the schema.xml file,
- with their mapping left up to the discretion of an admin tweaking the data-config.xml file, just before running DIH.

Off the cuff, one may be at a loss on how exactly to accomplish this ... and in doubt if it can even be done! You've seen this done for databases with Data Import Handler (DIH) but not with the XPath handler, or URL datasource, or File datasource for XML.

Well it can be done and here's an example:

Let's say your XML file looks something like this:


  
    hammer
    tough and durable
    heavy
    2 inches
  
  
    nail
    sharp and thin
    hazard
    1 inch

Now disposition, width, dangerLevel, height are pieces of data that you may not be able to plan ahead for, in your schema.xml file. So instead it makes sense to leave some wiggle room by defining somewhat predictable dynamic fields like so:
Keep in mind that you have some flexibility and responsibility here in terms of choosing the type of the dynamic field ahead of time.
Now when your customers or customer-facing admins, who handle the data-config.xml file, are looking to kick off DIH against an XML file that they know best ... it will be quite easy for them to come up with something like the following on the spot:
and still have an agreed upon well-oiled working index at the end of the day.

Monday, October 3, 2011

Embedding Videos in Joomla 1.7

1. Log in as the administrator.
2. Hover over the Extensions drop-down and click on Extension Manager.
3. In the Install from URL section, paste the URL pointing to a zip file for the AllVideos Joomla plugin.
   - http://joomlaworks.googlecode.com/files/plg_jw_allvideos-v4.0_j1.5-1.7.zip
4. Once you see a notification on screen for a successful installation, click on the Manage tab.
5. Locate the row that lists the AllVideos plugin and click on the red status icon in order to toggle it to enabled.
6. Hover over the Extensions drop-down and click on Plugin Manager.
7. Locate the row that lists the AllVideos plugin and click on the title of the plugin itself.
8. Configure the plugin based on your needs.

Sunday, October 2, 2011

Screencast Toolset

Best toolset that I've found for working on the Mac with screencasts:

ScreenR
Jing for recording.
SimpleMovieX for merging.
1. The videos merged using this tool will not work as intended on Vimeo or YouTube. They will stop at the very first location that was stitched together.
Final Cut Pro for merging.
1. The videos merged using this tool can be seamlessly uploaded to top providers like YouTube and everything in the video works as intended. But the content may show up as Public by default! So make sure to secure your content afterwards.
