Phine Solutions web work notes

ios static library

Filed under: my 2 cents by 1.618 — January 13, 2013 10:30 am

Using Xcode, it’s pretty easy to create a static library. However, when resources are involved, things can get a bit complicated. Here is a good tutorial on how to create a static library with Core Data support in an iPhone app.

Adding SSL to Apache – connection error

Filed under: server setup by 1.618 — February 12, 2012 10:20 am

Here is the background of the problem:

Recently I added an SSL certificate (purchased from a CA) to my Apache server. After making the configuration changes, opening the port and restarting the server, the https connection still wouldn't work. Chrome gave me "Error 107 (net::ERR_SSL_PROTOCOL_ERROR): SSL protocol error.", and Firefox simply said "connection interrupted". Neither error message turned up much help in a search. However, one tip that really helped was to try plain http on port 443 directly.

That worked. So in my case, it was almost as if Apache didn't recognize the SSL request at all. A new round of checking configuration and certificate files ensued.

I did find the fix in the end. In my Apache virtual host files, I have virtual host entries defined like this:

<VirtualHost SERVER_IP>

I had to change it to <VirtualHost SERVER_IP:80>

It's probably because Apache matched the SSL request to a non-SSL virtual host and tried to serve it as plain http. So if you are adding SSL to the mix, remember to check any previously defined virtual hosts on port 80 and tighten up their rules.
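For reference, here is a minimal sketch of how the two virtual hosts might end up looking. This is only an illustration: SERVER_IP, the domain and the file paths are placeholders, and the SSL directives assume mod_ssl is loaded.

```apache
# Plain-HTTP site: the explicit :80 keeps it from matching SSL requests
<VirtualHost SERVER_IP:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>

# SSL site on its own port
<VirtualHost SERVER_IP:443>
    ServerName www.example.com
    SSLEngine on
    SSLCertificateFile /etc/pki/tls/certs/example.crt
    SSLCertificateKeyFile /etc/pki/tls/private/example.key
    DocumentRoot /var/www/example
</VirtualHost>
```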

An unlikely place to look for an Amazon S3 issue

Filed under: PHP development by 1.618 — August 28, 2011 1:59 pm

This drove me nuts for a few hours, so I have to share it in case someone else runs into a similar issue.

Basically I have a PHP script that accesses Amazon S3 using the SDK for PHP library, and it runs on a server I recently built (CentOS). The problem was, the script wouldn't work. It didn't throw an error; it just wouldn't return any results. For example, if I tried to get a list of files with the same prefix, it returned an empty array. Yet the same script worked everywhere else, and I knew for sure that what it was looking for was there.

I looked everywhere and couldn't figure out why. Then I finally noticed that the system time was off by a few hours. Fixing that was a whole story in itself, and this article should explain all you need to know. But in the end, once the system time was corrected, the script worked again.

So if you have a script that consumes a web service, make sure the host's system time is set correctly. It might save you a few hours of a Sunday afternoon.
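A quick sanity check, as a sketch (the ntp commands assume CentOS-era tooling and are shown commented out for reference):

```shell
# Print the host's idea of UTC time and compare it with a trusted clock.
# S3 requests carry a signed timestamp, so a badly skewed clock makes
# them fail quietly.
date -u

# If the time is off, a one-off sync plus a running ntpd keeps it correct:
#   ntpdate pool.ntp.org
#   chkconfig ntpd on && service ntpd start
```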

A PHP class to read Microsoft Access Database

Filed under: PHP development by 1.618 — February 6, 2011 10:35 am

A friend of mine recently asked me to help build a data warehouse based on some Access database files. Here is a PHP class that I created to read records from an Access database.

class DataMdb {
	private $conn;

	function __construct($mdbFile) {
		// Set up the connection
		if (!$this->conn = new COM("ADODB.Connection")) {
			exit("Unable to create an ADODB connection");
		}
		$strConn = "DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=" . realpath($mdbFile);
		try {
			$this->conn->open($strConn);
		} catch (Exception $e) {
			exit("Caught exception: " . $e->getMessage());
		}
	}

	function getAll($sql) {
		$result = array();
		try {
			$rs = $this->conn->execute($sql);

			if (!$rs->EOF) {
				$rs->MoveFirst();
				$fieldCnt = $rs->Fields->Count();

				while (!$rs->EOF) {
					$row = array();
					for ($i=0; $i<$fieldCnt; $i++) {
						$row[$rs->Fields($i)->name] = $rs->Fields($i)->value;
					}
					$result[] = $row;
					$rs->MoveNext();
				}
			}
			$rs->close();

		} catch (Exception $e) {
			exit("Caught exception: " . $e->getTraceAsString());
		}

		return $result;
	}

	function disconnect() {
		$this->conn->close();
	}
}
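For illustration, here is a hypothetical usage sketch. The file name, table and column are made up, and note that since the class relies on COM/ODBC, it only runs on a Windows host:

```php
<?php
// Assumes the DataMdb class above is available
$db = new DataMdb('inventory.mdb');            // hypothetical .mdb file
$rows = $db->getAll('SELECT * FROM products'); // hypothetical table

foreach ($rows as $row) {
    echo $row['name'], "\n";                   // hypothetical column
}

$db->disconnect();
```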

Using Sphinx as a site search engine

Filed under: tools by 1.618 — August 8, 2010 6:17 pm

A website, especially a content-oriented one, needs good search functionality. Search can be implemented locally, or outsourced to a search engine like Google. The former obviously requires a lot of work in database design and coding; the latter relies on Google guessing what you have and what your users are looking for. And when site content sits behind a "walled garden", letting Google crawl and index the protected content becomes impossible, or insecure.

Sphinx is a search server that I believe offers a better approach to these issues. It handles indexing by reading directly from a database (or from files), and provides full-text search capability via standard APIs.

Overall Sphinx is pretty easy to set up. Installation on a Linux server means downloading the source code and going through the usual configure/"make install" process. If you have installed from source before, this should be easy. After installing the software, you also need to create a Sphinx configuration file to make Sphinx work in your environment. This is where I ran into some issues, and I'll share that experience in the rest of the post.
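For reference, a minimal configuration might look something like this. Treat it as a sketch only: the paths, credentials and SQL query are placeholders, and the directive names are from the Sphinx 0.9.x era.

```
source mydatabase_src
{
    type      = mysql
    sql_host  = localhost
    sql_user  = myuser
    sql_pass  = secret
    sql_db    = mydatabase
    sql_query = SELECT id, title, body FROM articles
}

index mydatabase_search
{
    source = mydatabase_src
    path   = /home/myuser/sphinx/data/mydatabase
}

searchd
{
    listen   = 9312
    pid_file = /home/myuser/sphinx/searchd.pid
    log      = /home/myuser/sphinx/searchd.log
}
```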

Basically Sphinx adds two processes to your server: indexer and searchd. The indexer is a process that should be kicked off periodically, at whatever frequency you wish, to index the data (mostly from a database). searchd, as the name suggests, is a daemon that listens on a port and handles search requests. The indexer can be driven by crontab, while searchd should run as a service, configured so that it starts automatically if the server reboots.
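For example, the indexer could be driven by a crontab entry along these lines (the paths and the 30-minute schedule are illustrative):

```
# Re-index every 30 minutes as myuser; --rotate tells the running searchd
# to swap in the fresh index (via a SIGHUP signal)
*/30 * * * * /usr/local/bin/indexer --config /home/myuser/sphinx/sphinx.conf --all --rotate
```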

Adding a new service startup script in a Linux environment means creating a shell script and putting it under the /etc/init.d directory. Here is a sample script that I use on a CentOS server (it was quickly put together, so it's rough around the edges):

#!/bin/bash
##
# chkconfig: 345 55 35
# description: Sphinx search daemon
#

case "$1" in
start)
echo -n "Starting Sphinx searchd:"
sudo -u myuser /usr/local/bin/searchd --config /home/myuser/sphinx/sphinx.conf >> /dev/null 2>&1
echo
;;
stop)
echo -n "Stopping Sphinx searchd:"
/usr/local/bin/searchd --config /home/myuser/sphinx/sphinx.conf --stop >> /dev/null 2>&1
echo
;;
restart)
$0 stop
$0 start
;;
*)
echo "usage: $0 [start|stop|restart]"
esac
exit 0

Notice that I added the sudo command so searchd runs as "myuser". This is because having indexer and searchd run under different users can pose some problems.

At this point I want to go over my setup a little bit. The server I have Sphinx installed on hosts several websites. A couple of users are set up, with different sites deployed under their respective home directories. Since I plan to use Sphinx on only one of the sites, I want the site owner "myuser" to own the indexer process, and to keep the Sphinx data and log files locally, somewhere under myuser's home directory. With this particular setup, if the searchd service runs as root, I run into permission issues.

First, searchd creates a *.spl file that myuser doesn't have read permission on. The indexer then produces the following error, even when the --rotate option is in fact present:

indexing index 'mydatabase_search'...
FATAL: failed to open /home/myuser/sphinx/data/mydatabase.spl: Permission denied, will not index. Try --rotate option.

Another issue is the ownership of the searchd.pid file. The indexer complains again if it can't read it:

WARNING: failed to open pid_file '…/searchd.pid'.
WARNING: indices NOT rotated.

If you wonder why the indexer needs access to these files: whenever the indexer runs with rotation, it notifies searchd by sending it a SIGHUP signal, and it reads the pid file to find searchd's process id.

Now, these issues could be worked around by changing the permissions of these files in the searchd startup script. But in the end I think using sudo is a cleaner solution. And ultimately, all Sphinx-related files, including the configuration and the process pid, are stored locally and can be accessed easily by the "site owner". The only drawback of this approach I can see right now is that when Sphinx is added to another site owned by a different user, there needs to be a separate searchd process for each site owner.

Since I'm still experimenting, this particular setup may not be the best solution. But hopefully the post can shed some light on issues that other people might run into.

Only a developer can understand this joke

Filed under: my 2 cents by 1.618 — June 30, 2010 11:23 am

Sending email using Google App and PHP Swift Mailer

Filed under: my 2 cents by 1.618 — June 23, 2010 3:19 pm

Not very long ago I converted one of my sites to use Google Apps' email service. Using a third-party email service reduces the load on your own server and eliminates the responsibility of configuring and maintaining a mail server. Since it's essentially Gmail, SMTP is supported. I paired it with Swift Mailer, the free PHP mail client, and the solution has been quite stable and satisfactory.

Until, out of the blue, I checked the mailbox of the default sending account: noreply@mydomain.com.

The mailbox was filled with undeliverable emails (which is normal) and, surprisingly, quite a few user emails. Here is the scenario: my website has an online form that lets a visitor send a message to an email address. So if abc@domain1.com uses the form to send a message to xyz@domain2.com, an email is delivered to xyz@domain2.com through Gmail, using noreply@mydomain.com. To give a better user experience, when the Swift Mailer message is constructed I also set the "from" address to abc@domain1.com, so that when xyz@domain2.com receives the email, the message appears to come from abc@domain1.com directly. The idea is also that when xyz replies, the message goes back to abc.

The problem is in the replying part. As you might have guessed, all the replies went back to noreply@mydomain.com. So basically all responses were lost, since no one cares about the noreply mailbox.

I looked more carefully at the email headers and found the problem: the from address ends up like this: abc@domain1.com<noreply@mydomain.com>.

As you can see, abc@domain1.com is only treated as a "display name" and the actual email address is still noreply@mydomain.com. So even though the from address appears correct in an email client, replying to the message sends the response into a black hole.

After identifying the problem I started tweaking the Swift Mailer message, but no matter what I tried, Gmail would always put the noreply address in the from header. This left me in a despairing mood, because the other options weren't good: write a program to automatically check the mailbox and forward those emails, modify the message with a warning telling the receiver not to reply directly, or just switch to a new email service. All of these options are either too complicated (without a good cause) or would hurt the user experience.

After some more poking around I found this email header field: "Reply-To". If it did what its name indicated, my problem could be solved simply by adding this header to outgoing messages. You might be laughing at me right now for not knowing this, but before this I had never really studied email headers, and I learned a lot while solving the problem.

Adding a header field is pretty easy in Swift Mailer, and it worked. With the "Reply-To" header, the mail clients (and the webmail interfaces I tested) correctly filled in the right address when replying.
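A sketch of what the fix looks like, assuming Swift Mailer 4.x; the addresses are the placeholders from the example above, and $mailer and $messageBody are assumed to be set up elsewhere:

```php
<?php
// From stays the Gmail-authorized account; Reply-To carries the real sender
$message = Swift_Message::newInstance('Message from our site')
    ->setFrom(array('noreply@mydomain.com' => 'abc@domain1.com'))
    ->setTo(array('xyz@domain2.com'))
    ->setReplyTo(array('abc@domain1.com'))  // replies now reach abc directly
    ->setBody($messageBody);

$mailer->send($message);
```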

It’s always fun to be able to solve a problem in a simple way, before going too far down the other paths. And hopefully this post can help someone else in the same boat.

Plug those Javascript memory leaking holes

Filed under: my 2 cents by 1.618 — April 29, 2010 9:55 am

Recently I built a Javascript app that uses the Gmap API, Ajax and JQuery to display location information on a web page. It retrieves information from the server via Ajax and uses the returned JSON object to display locations of things like businesses, hotels and schools on a Google map. The project was fun to work on, and I learned a good lesson about Javascript memory leaks as well.

As I developed and tested the code in my browser, I realized the browser was getting less and less responsive. So I opened up the task manager to find out what was going on, and found that the browser's memory usage kept growing every time I refreshed the page, and never came back down. At one point, Firefox was hogging over 600M of memory on my system. The behavior was consistent across Internet Explorer and Firefox. This means that if a user spends an extended time on the page, he'll have to shut down the browser to reclaim the system memory.

Not a very good user experience. Being an amateur Javascript developer, I don't have a lot of insight into how memory is managed in Javascript. As a matter of fact, I had largely ignored it in my petty little scripts before. This time I had to take it seriously, so I hit the web to see what could cause a Javascript memory leak.

What could cause memory leaks

When we talk about a Javascript memory leak, we usually mean memory that is not properly de-allocated when a web page is closed or refreshed. Web browsers have built-in garbage collectors to collect and release memory when it's no longer used. Usually, when a page is done, the browser will remove everything created by Javascript, unless there are connections between Javascript objects (variables and functions) and DOM elements.

Another situation is probably better described as memory management. A good Ajax app lets users access different information while staying on the same page, and in this case the app can also eat up memory and become sluggish over time. Most of the time we might not notice, but a poorly written app can ramp up memory usage quickly and cause a bad user experience.

In theory, what should be avoided

The two biggest culprits that cause memory leaks are "circular references" and "closures". Other people have written good articles about them; this article illustrates the problems with a relatively simple explanation. A circular reference is basically what I would call "spaghetti code" in Javascript. Modern object-oriented languages have helped reduce this kind of code, but in Javascript it can happen easily. A simple example is a DOM element that refers to a Javascript function (through an onclick handler, for example), while the Javascript function refers back to the DOM element. If you have used inner functions in Javascript, you know what a closure is. It is convenient for inheriting variables, but it is also an easy way to hide a circular reference.

These two things are probably the source of most memory leak problems. However, theory aside, I found it hard to spot them in real code; most of the time the code doesn't really resemble the evil demos. The best way to avoid a problem is to stay away from it, so I came up with some practices that I believe will help avoid problems, or at least minimize the risk.

Watch the variables

There are two types of variables in terms of scope. One is declared outside a function, with or without the var keyword; the other is declared inside a function, WITH the var keyword. The latter is obviously supposed to live a shorter life and only be used within the function's scope. Although the first type is often referred to as a "global variable", it is not quite the same as a global variable in other languages, because it is only supposed to live for the lifetime of a page.

The key is "supposed to". Since Javascript runs in browsers, it is the browser's responsibility to release the memory allocated to variables when they are no longer in use. To judge whether a variable is "in use", the browser checks whether there is any other reference to the object. If there is, the browser will not de-allocate it, and as activity continues, the browser keeps taking memory without releasing it.

The problem is that browsers don't always manage to release that memory. Here is a scenario where a "closure" comes into play. One common example of a closure is a function defined inside another function, but there is more to it; to understand closures better, this is a great article.

But a closure can easily cause a memory leak when used like this:

function localDocElem() {
    var elem = document.getElementById('ClickElem');
    elem.onclick = function () {
        elem.innerHTML = '<span class="highlight">' + elem.innerHTML + '</span>';
    };
}

This function simply attaches a click handler to a DOM element, so whenever it is clicked the content gets "highlighted". However, the function attached to the DOM object "elem" also uses that DOM object, so a circular reference is created between the DOM object and the function object (in Javascript, even a function is maintained as an object). The browser then won't garbage-collect the local variable elem, even after localDocElem returns. In this particular example, since a DOM object is tied to a Javascript function, it's possible the browser won't collect either of them even after the user leaves the page, and the only way to get the memory back is to shut down the browser completely.
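One way to break the cycle, as a sketch using the same hypothetical element id: use `this` inside the handler instead of closing over `elem`, and clear the local variable before returning.

```javascript
function localDocElem() {
  var elem = document.getElementById('ClickElem');
  elem.onclick = function () {
    // `this` is the element that was clicked, so no closure over `elem` is needed
    this.innerHTML = '<span class="highlight">' + this.innerHTML + '</span>';
  };
  elem = null; // drop the local reference so the scopes are no longer tied together
}
```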

Using a global variable to hold large amounts of data can reduce some of the risk above. Say you have a Javascript app that uses Ajax to pull data from the server constantly; you can use JQuery's $.getJSON method to achieve this. The returned data is better held in a global variable than in a local one.

Un-allocate and nullify often

In a map application, lots of objects like icons and markers are created dynamically. They take memory to store, and they can eat up large chunks of memory quickly. It is good practice to "nullify" them whenever they are no longer in use, for example when displaying a completely different set of markers in a map view. Usually we can just assign null to the variable to achieve this, but for a clean de-allocation you'll also need to remove anything attached to them, such as events and callback functions.

Remove listeners

To make an app respond to user activity, we create event listeners. In a Gmap app, for example, we add a click listener to an icon so that when it is clicked, more detailed information is shown. In other cases, you could add a click listener to a div element so that whenever it is clicked, the background color changes. Since listeners and the functions they trigger are attached to an object, you can see this is another easy way to get into a circular reference loop. In my code, a lot of listeners are created on the fly for those Gmap "markers".

What I found helpful was to promptly remove the listeners when the markers are refreshed (along with the code that nullifies the marker variables).
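As a sketch of what that cleanup can look like (GEvent.clearListeners and removeOverlay are from the Gmap API of that era; the clearMarkers name and the markers array are my own invention):

```javascript
function clearMarkers(map, markers) {
  for (var i = 0; i < markers.length; i++) {
    GEvent.clearListeners(markers[i], 'click'); // detach the click handlers first
    map.removeOverlay(markers[i]);              // remove the marker from the map
    markers[i] = null;                          // then drop our own reference
  }
  markers.length = 0; // empty the array so nothing keeps the markers alive
}
```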

Delete the dynamically generated DOM objects

Often we use Javascript to create DOM elements, like "populating" a div tag on the fly. Since some browsers manage DOM objects and script objects differently, the circular reference scenario discussed above is easy to create. When you are done with the content and move on to something else, it might be helpful to eliminate the DOM elements completely. For example, if you have updated the content of a div tag, you can empty out its inner HTML like this:

document.getElementById('MyAjaxTarget').innerHTML = '';

Create cleanup methods for different stages

GUnload is provided as the default Gmap cleanup call for when the page is unloaded. However, it is probably not enough, since it only runs when a user leaves or refreshes the page. What we need to make sure is that memory is always released promptly when it's not in use. So I created several "cleanup" methods to do just that. The cleanup methods can include everything discussed above, and they are called whenever needed in the code flow.

In summary, while Javascript enables us to enrich the user experience, it can also do just the opposite when not used correctly. Although modern browsers do a much better job of Javascript garbage collection, we still need to do our due diligence and develop code that is not only fancy, but also "responsible" to our users.

Here is the link to my Google map app if you are interested.

If you have a good tip for managing memory usage in Javascript, please share it in a comment!

When do we need “smooth scrolling”

Filed under: usability design by 1.618 — April 5, 2010 9:58 am

Recently I discovered this little javascript hidden gem: smooth scrolling to an anchor link on the same page.

Basically it creates a smooth scrolling effect when a user clicks an anchor tag that points to a different part of the page, most likely a very long one. Often, in this kind of situation, the page jumps so quickly that I have to look at the URL in the browser's address bar to see if I'm still on the same page; for a lot of users, this can be confusing. This little script, once included in the "head" tag, makes that scrolling go slower. Although it's still a pretty fast scrolling motion, it slows down enough to make users aware that they are being taken to a different section of the same page.

Another benefit of the approach is that the scrolling subtly reveals the rest of the page as well, which is a nice way to encourage users to explore other parts of the page.

Typically I'm not a big fan of implementing fancy but impractical effects on a web page with Javascript. They have a cost to load, and for most users, visual effects wear out after the initial "oohs and ahs". However, this one is a great implementation that addresses an issue too trivial for browsers to fix, but valuable for web designers.

Build your own Linux VPS

Filed under: server setup by 1.618 — February 26, 2010 11:57 am

Five years ago, if you had asked me to build my own Linux VPS for my websites, I would have shaken my head and said it was too much for a non-sysadmin like me. Now I'm pretty comfortable doing it. I want to share my thoughts in this post, and hopefully it can be useful to other web builders.

When I first started, I put my websites on shared hosting servers. While shared hosting is cheap and easy to set up, you often don't get the best performance for your dollar. This is especially true when your site gets more traffic and you are as anal about downtime as I am. Then I discovered VPS hosting, which is a great step up from shared hosting. Having a VPS means you have full control over a virtual host, so you can install and configure it the way you like; and because a VPS is a "rented" slice of a physical server, someone else takes care of the racking, networking and hardware maintenance.

There are generally two types of VPS. One is "managed" and the other is, obviously, "unmanaged". "Managed" here means the service provider will help you install apps and troubleshoot, and in some cases will walk you step by step through resolving an issue, as long as you ask. Often you also get a full web-based control panel, like CPanel, installed for you. The latter, apparently, comes with no such service: you are given a barebones server and you are on your own. As you can guess, a managed service costs more.

I started with a managed VPS. But as my Linux skills got better, the "managed" part of the service became less and less necessary. CPanel is a great tool, but it is also a big resource consumer itself. Sometimes you might find most of your system resources consumed by add-ons rather than by the main apps, like the web server or the database.

To grow out of a managed service, the key is to try and to learn. If you choose to stay on CPanel forever (not that there is anything wrong with it :)), you'll never be pushed to learn. To take the leap you need to be ready for it. It took me some time to get to the comfort level I'm at now, and along the way I found a number of guidelines that I now follow.

Keep a good and updated server diary

I think this is the first thing to do when you start handling your own server. A good server diary not only helps you troubleshoot, it's also a good reference when you need to re-install, upgrade, or, less frequently, move to a new service provider.

Like any other service you buy, a hosting provider's service quality can go down too, and you pretty much have no choice except voting with your feet. A good service log makes changing hosts a lot easier. Recently, when I switched my VPS provider, it only took me a few hours to stand up full services on a brand-new VPS host. The process also forced me to refresh and update my server diary.

Utilize the external services

To host a website on a VPS, we have to install pretty much everything ourselves: at minimum the Apache web server, PHP and MySQL. Beyond the basic LAMP stack, we also need to take care of services like DNS and email. To keep your admin work as simple as possible, I strongly recommend outsourcing DNS and email to external providers. For DNS, there are lots of options; I use dnsmadeeasy.com. For only a few dollars a month, you are completely shielded from managing your own DNS server. You still need to understand what a "cname" is and how to change a DNS record to point your website at the correct IP, but the learning curve is much, much smaller.

Email is another service that can get quite complex. I found that using Google Apps' email service makes a lot of sense. Since the service is based on Gmail, IMAP, an effective spam filter and web access are all included naturally. Without a mail server and SpamAssassin taking up resources, your server is also better optimized. Google Apps has both free and paid versions. Another benefit of using a reputable external email service is the trust your emails gain. A lot of my users on Yahoo mail couldn't receive messages because they were marked as spam, but since I started using the Gmail service it has been a lot better.

If your website has a lot of user-generated content like photos, you may also consider cloud storage like Amazon AWS. I'm generally against making your site completely dependent on the cloud, but that's another subject.

Install from source

I know this is quite a debatable subject. Installing an application from source doesn't always give you the kind of control you get from packaging tools. However, there are several benefits that can't be overlooked.

First, you have full control over the binary, and you can build exactly the binary you want. For example, when building your own PHP binary, you can enable only the features you need to keep a small footprint. The same applies to the Apache httpd server. This directly impacts the memory usage of your web server.

Secondly, once you are accustomed to source installation, you no longer need to hunt around for the latest RPM or whatever package someone else has built; you can stay up to date with the latest version of the software. And since the same procedure applies universally across Linux distros, you are less likely to be tripped up by the different packaging tools that different distros offer.

And lastly, it’s really not that hard to do.

Some basic steps

With a brand-new VPS, some basic setup has to be done to ensure security and basic usability. Your VPS provider will configure the VPS to a certain degree before handing it over, so you might want to look at system configuration such as partitioning before proceeding to the steps below.

Update system information

When your new server is up and running, you'll need to update the host name:

echo "mynewserver.com" > /etc/hostname
hostname -F /etc/hostname

And don't forget to update the hostname variable in the /etc/sysconfig/network file.

Also set the system time zone:

ln -sf /usr/share/zoneinfo/US/Eastern /etc/localtime

Create users

Adding users is the second must.

groupadd johndoe

useradd -d /home/johndoe -g johndoe -m johndoe
passwd johndoe

To create the user "apache" for your web server, you'll need the following commands:

groupadd apache
useradd apache -c “Apache Server” -d /dev/null -g apache -s /sbin/nologin

Turn off the unnecessary services

By default you'll have some services up and running that you don't need, so you'll want to turn them off.

This command shows which services are on:

chkconfig --list | grep 3:on

This command turns one off:

chkconfig <service name> off

Also, make sure to double-check who is listening on which ports. On my servers, I only leave the ports for sshd, httpd, mysqld and a few others open:

netstat -an

Secure SSH

You want to turn off root login. In the /etc/ssh/sshd_config file, set:

PermitRootLogin no

Also you want to set up public/private key authentication.

I would also recommend changing the port from 22 to something else.
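Putting those together, the relevant sshd_config lines might look like this. The port number is only an example, and disable PasswordAuthentication only after you have confirmed key-based login works; restart sshd afterwards and keep an existing session open while testing.

```
Port 2222
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication no
```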

Set up a firewall using iptables

If you are reading this article you probably know what iptables is; the tricky part is configuring it. A few years ago I went to great lengths to learn what tables and chains are and how to write a shell script to configure an iptables firewall. The problem was that, as a developer who doesn't touch iptables daily, I soon forgot what I had learned. And guess what: I locked myself out on my first try on a new server.

Luckily, there are tools today that wrap around iptables and expose an easy-to-use configuration interface, which makes life a lot easier for me. APF is what I use; the project page can be found here: http://www.rfxn.com/projects/advanced-policy-firewall/.

Install some utilities, compiler and libraries

I use a CentOS/Redhat system as the example, and yum is my packaging tool of choice. Again, these are just the basic tools and libraries for installing Apache and PHP, so others might be needed as well. The key, again, is to keep a good log of what has been installed so you have a reference when you build your next server.

yum install man
yum install vixie-cron
yum install wget
yum install rsync
yum groupinstall 'Development Tools'
yum install mailx
yum install zlib-devel
yum install openssl-devel
yum install libxml2-devel
yum install curl
yum install curl-devel
yum install libjpeg-devel
yum install libpng-devel
yum install mysql-devel
yum install libxslt-devel
yum install libmcrypt
yum install libmcrypt-devel
yum install libevent
yum install libevent-devel

Install applications

Now comes the time to install your beloved apps. One thing to remember if you install from source: create a startup script in /etc/init.d and add the service entry. For example, after installing the Apache httpd server, you need to add its startup/shutdown script to /etc/init.d and add it to your service list:

chkconfig --add httpd
chkconfig --levels 235 httpd on

Install a MTA

It's likely that your website needs to send email, so you probably need an MTA to talk to your external email server through SMTP for message delivery. If the system comes with Sendmail installed, I'll go ahead and use it; here is a post I wrote on getting sendmail to work with Gmail. There are other options, like Postfix, Exim and qmail, that you can consider, and here is a good article comparing MTAs. Although there are lots of pros and cons to munch on, I think the most important thing to consider is which one is easiest for you; they are all very capable products these days, so any one of them can meet your needs.

This has been quite a long-winded post. I don't mean for it to be a tutorial, just to cover some basics of building a Linux VPS (or a dedicated server, for that matter). A few years ago it was almost unthinkable to me that all of this could be done by one person, but as the tools, technology and information become more available and easier to find, it is quite feasible now. I hope the post can be a good starting point for web builders who are interested in setting up their own server. And please do leave a comment if you have any thoughts, tips or suggestions.

©phinesolutions.com