Developing Web-based applications

Best practices

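Use HTTPS for the login page to prevent eavesdropping. Once the user has been successfully authenticated, generate a session ID and send it to the client as a cookie. Keep the rest of the session on HTTPS as well: going back to HTTP may be slightly faster, but it sends the session ID in clear text, where it can be captured and replayed
Passwords should be stored as salted hashes, not in clear text; other sensitive data like credit card data should be saved encrypted in the database or in an include file outside the web server's docs directory
Use POST rather than GET to send form data to the server, so that the data does not end up in URLs, browser history, and server logs
Sanitize input data, as attackers will try to send SQL, exceedingly long variables, etc. For SQL, prefer parameterized queries (prepared statements) over escaping functions such as mysql_real_escape_string(), which was removed in PHP 7. Generally speaking, don't trust user input (SQL injection, cross-site scripting attacks, etc.)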
On a subscription server, don't assume that the user will visit pages in a particular order, and check user access in every page/controller using the session ID
Remember that a user may access the application multiple times at the same time with the same user account, either through multiple windows of the browser on one computer, or through multiple computers
Do not save anything in cookies apart from a session ID, which can be encrypted for added security
Session IDs should be time-limited so that a session left open on a client host expires after X minutes of inactivity
To keep things tidy, use the same page/controller to display a form, check its input, compute, and send a reply
Do not use SELECT *: The order and position in which columns are returned does not remain the same if you add, move, or delete columns. A simple change to your table structure could cause your application to fail
Watch out for the back button
Anticipate that your site may become much more popular and might need to run on multiple web servers connecting to multiple DB servers with write requests handled by more than one DB server...
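
As an illustration of two of the points above (not trusting user input, and avoiding SELECT *), here is a minimal sketch in Python. It assumes the standard sqlite3 module as a stand-in for whatever database you use; the users table, its columns, and the find_user and demo names are made up for the example.

```python
import sqlite3

def find_user(conn, login):
    """Look up one user by login with a bound parameter, never string concatenation."""
    # Columns are named explicitly instead of SELECT *, so adding or
    # reordering columns later does not change what this function returns.
    cur = conn.execute(
        "SELECT id, login, email FROM users WHERE login = ?",
        (login,),
    )
    return cur.fetchone()

def demo():
    """Build a throwaway in-memory database with one user (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT, email TEXT)")
    conn.execute(
        "INSERT INTO users (login, email) VALUES (?, ?)",
        ("alice", "alice@example.com"),
    )
    return conn

if __name__ == "__main__":
    conn = demo()
    print(find_user(conn, "alice"))
    # The classic injection string is treated as a plain value and matches nothing.
    print(find_user(conn, "' OR '1'='1"))
```

Because the login is passed as a parameter, the database driver never interprets it as SQL, whatever characters it contains.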

Links on security:

Why write web applications?

Ease of deployment since, if you pay attention not to use browser- and version-specific pages, a web application can be accessed from any host that has a web browser preinstalled
Any update to the code only needs to be performed on the server(s)
Widgets supported by browsers are not as sophisticated as those available to desktop applications, which in a way is a good thing because users only need to know how to click, hence reduced training and fewer support calls
Make sure you separate business (code), presentation (the actual input/output as displayed by the web browser), and data access using templates
Frameworks like Enhydra, Midgard, or Zope provide tools to speed up development over plain languages like Perl or PHP. Multiple templating systems are also available for Perl, PHP, and Python. Read about the Struts framework to learn a bit more about the MVC (Model-View-Controller) model

Why not write web applications?

R.I.P. Browsers by Miss Rogue

What kind of architectures are there?

Server-side: CGI, embedded code in HTML pages (PHP, Perl, ASP, JSP, etc.), application server (Java servlets, Zope, etc.)
Client-side: embedded script (JavaScript, VBScript, AJAX), embedded binaries (Java applets, ActiveX components)

Which tool to use?

There are basically two kinds of web sites:

Those whose purpose hasn't changed from its original goal, ie. serve documents (except you should go from static to dynamic pages to make it a snap to display documents in a layout that is consistent throughout the site and easy to change, eg. adding a header + footer + navigation bar)
Those that are mostly web applications, eg. Hotmail.

For the former, dynamic pages executed by a standard web server are fine, but for the latter, you should take a look at application servers: they usually have their own embedded web server, but on bigger sites they are best run behind a web server acting as front-end.

Application server

What is it?

compiles all the code for the application, and keeps it in RAM
the interpreter doesn't have to be started each time it's accessed
remains in memory at all times, which improves performance and makes it easier to maintain state information (ie. to work around the statelessness of HTTP) and to keep connections to the database open
URLs can be turned into calls to objects and methods, and hence, look a lot more familiar to developers who have no experience with web applications but plenty of experience with dedicated applications written in OOP
Is often needed for languages other than PHP when running with Apache, as starting their interpreter for every request makes them much slower than PHP
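
To make the "remains in memory" point concrete, here is a minimal sketch of a persistent application in Python, using the WSGI interface from the standard library. The HITS counter and the call_once helper are assumptions made for the example; a real application server would keep database connections or compiled templates in memory the same way.

```python
from wsgiref.util import setup_testing_defaults

# Module-level state: created once when the server process starts and
# kept in RAM for as long as the process lives.
HITS = {"count": 0}

def application(environ, start_response):
    """WSGI entry point; the long-running server calls it once per request."""
    HITS["count"] += 1
    body = ("Request number %d since startup\n" % HITS["count"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

def call_once():
    """Simulate one request without opening a socket (helper for the example)."""
    environ = {}
    setup_testing_defaults(environ)
    collected = {}

    def start_response(status, headers):
        collected["status"] = status

    chunks = application(environ, start_response)
    return collected["status"], b"".join(chunks)

if __name__ == "__main__":
    # The counter survives between calls because the process never exits.
    print(call_once())
    print(call_once())
```

To serve it for real, the standard library's wsgiref.simple_server.make_server("localhost", 8000, application).serve_forever() is enough for testing. With plain CGI, the counter would start again from zero on every request.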

Why should I use it instead of writing code-filled pages?

Traditionally, web applications were written by creating pages in the web server's htdocs directory that the browser would call, and that the interpreter would execute before returning the output. For example:

/index.php is called by the client
The web server detects that this page contains PHP code, and has it executed by the PHP interpreter
The PHP interpreter outputs data to the web server, which then passes it on to the browser for rendering.

Without an application server, each page must be loaded and compiled each time it is requested, and state information is handled by saving a session ID and extra information into a database.

As a practical example, page1.php shows a login/password page and a Next button which calls page2.php that will connect to a database to check the credentials. The problem is that HTTP is stateless and a regular web server has no built-in mechanism for pages to share state, so whatever page1.php knows is gone by the time page2.php runs, and page2.php sees only what the browser sends with that request. Various solutions have been found to solve this problem (hidden form fields, cookies, session IDs), but using an application server is a better solution if you have the choice of running your own web server.
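
One of those solutions, the session ID stored in a cookie, can be sketched as follows in Python. This is only an illustration under stated assumptions: the SESSIONS dictionary lives in the memory of a single long-running process, the cookie name "sid" and the 15-minute timeout are arbitrary, and start_session and load_session are names made up for the example.

```python
import secrets
import time
from http.cookies import SimpleCookie

SESSION_TTL = 15 * 60   # seconds of inactivity allowed (assumed value)
SESSIONS = {}           # session ID -> {"data": dict, "last_seen": float}

def start_session(data, now=None):
    """First page: store the data server-side, return (session ID, Set-Cookie value)."""
    now = time.time() if now is None else now
    sid = secrets.token_urlsafe(32)          # unguessable random ID
    SESSIONS[sid] = {"data": dict(data), "last_seen": now}
    cookie = SimpleCookie()
    cookie["sid"] = sid
    cookie["sid"]["httponly"] = True         # not readable from JavaScript
    cookie["sid"]["secure"] = True           # only sent over HTTPS
    cookie["sid"]["path"] = "/"
    return sid, cookie["sid"].OutputString()

def load_session(cookie_header, now=None):
    """Second page: return the stored data for the request's Cookie header, or None."""
    now = time.time() if now is None else now
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "sid" not in cookie:
        return None
    sid = cookie["sid"].value
    entry = SESSIONS.get(sid)
    if entry is None:
        return None
    if now - entry["last_seen"] > SESSION_TTL:
        del SESSIONS[sid]                    # expired after too long without activity
        return None
    entry["last_seen"] = now
    return entry["data"]
```

Only the random ID travels to the browser; the data itself stays on the server, and the ID stops working after SESSION_TTL seconds without a request.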

More information

OWASP's top risks list

From the Open Web Application Security Project.

Unvalidated parameters: Failure to validate information from Web requests before it is used by a Web application. Attackers can use these flaws to attack backend systems through a Web application.
Broken access control: Restrictions on what authenticated users are allowed to do are often not properly enforced. Attackers use this to access other users' accounts, view sensitive files or run unauthorised functions.
Broken account and session management: Account credentials and session tokens left without proper protection, leading to the risk that crackers could assume victims' identities.
Cross-site scripting flaws: A modern classic - mistakes here mean Web applications can be used as a mechanism to steal session tokens, attack a local machine or spoof content.
Buffer overflows: Arguably the most common type of security risk (so why isn't it number one? Ed). Sloppy programming means applications fail to properly validate inputs - so maliciously constructed, malformed requests can crash a process and be used to inject hostile code into target machines.
Command injection flaws: If an attacker can embed malicious commands in parameters passed to external systems these may be executed on behalf of a web application, to unpleasant effect.
Error handling problems: If an attacker can cause errors which are improperly handled, all manner of mischief (information disclosure, system crashes etc.) might be possible.
Insecure use of cryptography: Web apps frequently use cryptography. If that's not coded properly, sensitive information won't be adequately protected.
Remote administration flaws: If remote Web admin tools are insecure then an attacker stands a chance of gaining full access to all aspects of a site.
Web and application server misconfiguration: Don't trust out of the box security
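
As an example of the "insecure use of cryptography" and "broken account and session management" items, here is a minimal sketch of password storage in Python using only the standard library. The iteration count is an assumed value to be tuned, and hash_password and verify_password are names made up for the example; it is one reasonable approach, not the only one.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware

def hash_password(password, salt=None):
    """Return (salt, digest) for storage; the clear-text password is never stored."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

if __name__ == "__main__":
    salt, stored = hash_password("s3cret")
    print(verify_password("s3cret", salt, stored))
    print(verify_password("guess", salt, stored))
```

The database then holds only the salt and the digest, so a stolen table does not directly reveal any password.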

Going from dedicated apps to web apps

Forms

There are two ways to check input from a form, eg. checking that all the required fields were filled correctly:

The traditional way, which is to call a second page using the ACTION attribute of the FORM tag, which is requested after the user has hit a submit button. This requires a round-trip to the server each time a field is incorrect
The more advanced way, which is to embed some JavaScript code in the form, and call this code when the user clicks on the submit button, without letting the user upload the results until all the fields are correctly filled. Here's a minimal sample that only checks for empty fields (the two INPUT fields are placeholders):

<form name="my_form" id="my_form"
onSubmit="
/* 'this' is the form itself; stop at the first empty field */
for (var i = 0; i < this.elements.length; i++) {
  var field = this.elements[i];
  if (field.value == '') {
    alert('Please fill in the field: ' + field.name);
    field.focus();
    /* returning false cancels the submission */
    return false;
  }
}
return true;
">
<input type="text" name="login">
<input type="submit" name="send" value="Send">
</form>
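
Client-side checks only save round-trips; since JavaScript can be disabled or tampered with, the server has to repeat them. A minimal sketch of the server-side counterpart in Python, assuming the form fields arrive as a dictionary of strings (the MAX_LENGTH limit and the validate_form name are made up for the example):

```python
MAX_LENGTH = 200  # assumed upper bound for any single field

def validate_form(fields, required):
    """Return a dict of field name -> error message; an empty dict means valid."""
    errors = {}
    for name in required:
        # A missing field and a field holding only spaces are both "empty".
        if not fields.get(name, "").strip():
            errors[name] = "This field is required"
    for name, value in fields.items():
        # Reject exceedingly long values before they reach the rest of the code.
        if len(value) > MAX_LENGTH:
            errors[name] = "Value is too long"
    return errors

if __name__ == "__main__":
    print(validate_form({"login": "alice", "password": ""}, ["login", "password"]))
```

The page that displays the form can call this on submission and redisplay the form with the returned messages when the dictionary is not empty.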

Cookies

What are cookies and how can I use them in my scripts?

Sessions

What if you need to restart the server (either plain or persistent)? If you use server-side sessions with no database, all information is lost...
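
A minimal sketch of sessions that survive a restart, in Python, assuming the session data is JSON-serializable and using the standard sqlite3 module as the database. The table layout and the open_store, save_session, and restore_session names are made up for the example.

```python
import json
import sqlite3
import time

def open_store(path):
    """Open (or create) the session database file at the given path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "sid TEXT PRIMARY KEY, data TEXT NOT NULL, last_seen REAL NOT NULL)"
    )
    return conn

def save_session(conn, sid, data, now=None):
    """Write the session to disk so it outlives the server process."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO sessions (sid, data, last_seen) VALUES (?, ?, ?)",
            (sid, json.dumps(data), now),
        )

def restore_session(conn, sid):
    """Return the stored data for a session ID, or None if it is unknown."""
    row = conn.execute(
        "SELECT data FROM sessions WHERE sid = ?", (sid,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

After a restart, the server calls open_store() on the same file and restore_session() finds the sessions that were saved before it went down.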

Templates

What are templates and how can I use them?

Frameworks

Client-side code

Basic JavaScript

Since ActiveX and Java applets never really made it, the only game in town is JavaScript. Apart from some similar-looking syntax, JavaScript has nothing to do with Java, and in particular does not require a compiler to be installed on the client host for the downloaded code to run. Instead, JavaScript is clear-text code embedded in web pages, which is run after it's been downloaded as part of the web page. Its main uses are to build richer interfaces (eg. trees, WYSIWYG editors, etc.), write event-driven code (eg. copying into a text box the value of the item that was selected in a combobox), and check the contents of a form and only upload the result once the user has entered valid parameters (relying on the standard <FORM ACTION="myscript.html"> translates into multiple round-trips between the server and the client until all the fields are correctly filled).

AJAX

Cleaning up with AJAX by Christopher Harrison
Call SOAP Web services with AJAX, Part 1: Build the Web services client by James Snell

Server-side code

CGI

The Common Gateway Interface, or CGI, is a standard for external gateway programs to interface with information servers such as HTTP servers. The reason you should look at alternatives is that a CGI call forks a new process (or launches a new thread) each time a script is called, which performs very poorly under load.

wincgi

Looks like a way to write CGI scripts in VB. http://docs.rinet.ru:8083/UCGI/ch14.htm

mod_cgi

mod_cgi (Dynamic Content with CGI) is the Apache module to run scripts through CGI, which is the slowest method to run scripts.

mod_fastcgi

mod_fastcgi (FastCGI: A High-Performance Web Server Interface) is a faster method: a server process runs constantly and serves dynamic content when requested by the web server. This increases performance and provides persistence.

mod_scgi

mod_scgi is yet another solution, built on the SCGI protocol, which was designed as a simpler alternative to FastCGI (mod_scgi is the Apache module, while the "scgi" package is the server part, written in Python).

mod_pcgi and PCGI

mod_pcgi is a module, while PCGI is a CGI script (stuff built by the Zope team.)

mod_python

mod_python is an Apache module that embeds the Python interpreter within the server. With mod_python you can write web-based applications in Python that will run many times faster than traditional CGI and will have access to advanced features such as the ability to retain database connections and other data between hits, and access to Apache internals.

Other Apache modules to run Python scripts are mod_snake and PyApache (which seems to be no longer developed), but mod_python seems the best choice of the three.

All of those modules are just ways to connect Apache with Python scripts, but are not frameworks in themselves.

The current main differences between mod_snake and mod_python are:

mod_snake works in both Apache 1.3 and 2.0 series. mod_python currently only runs in Apache 1.3.
mod_python runs under Windows -- mod_snake does not.
mod_snake allows Python written modules to create their own configuration directives, and gives the modules more power similar to that of C-style Apache modules (via per-dir configs, etc.) mod_python only allows for mod_perl style callbacks.

ISAPI, NSAPI

ISAPI and NSAPI are the native extension APIs of Microsoft's IIS and Netscape's web servers, respectively.

Caching

http://www.pacificnet.net/~johnr/meta.html

http://www.webthing.com/tutorials/cgifaq.3.html#19

Displaying images

Use the WIDTH and HEIGHT attributes of the IMG tag so the browser can lay out and display the text before the pictures have been downloaded.

Access to a SQL server

SQL servers are not optimized for intensive connection/operation/disconnection cycles, which is what CGI will do. Use a persistence engine so the connection/disconnection only occurs when starting/stopping the engine. Even better, find a web server that can talk directly to a SQL server so your application doesn't have to handle that part. Also, you might want to run the web server and the SQL server on different hosts, and remember to set up the SQL server so that it only accepts requests coming from the web server (one less risk of your data being hacked from the Net). Obviously, you will have remembered to change the default username/password combo before linking the SQL server to the world...
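
The idea of connecting once and reusing the connection can be sketched as follows in Python. The standard sqlite3 module stands in for a networked SQL server here, and the get_connection and handle_request names, as well as the dsn parameter, are assumptions made for the example.

```python
import sqlite3

_connection = None  # opened once per process, then reused by every request

def get_connection(dsn=":memory:"):
    """Return the shared connection, opening it on first use only."""
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(dsn)
    return _connection

def handle_request():
    """A request handler that never pays the connection cost itself."""
    conn = get_connection()
    return conn.execute("SELECT 1").fetchone()[0]

if __name__ == "__main__":
    print(handle_request())
    # Both calls return the very same connection object.
    print(get_connection() is get_connection())
```

This only pays off in a persistent process; under plain CGI the module is reloaded on every request and the connection is opened again each time.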

Multiple Submits

When the user clicks on Submit more than once... http://www.webthing.com/tutorials/cgifaq.3.html#19
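
One common way to detect a repeated submit is a one-time token embedded in the form as a hidden field. A minimal sketch in Python, assuming the tokens are kept in the memory of a persistent server process (the issue_form_token and accept_submission names are made up for the example):

```python
import secrets

_pending_tokens = set()  # tokens handed out with a form and not yet used

def issue_form_token():
    """Call when displaying the form; put the result in a hidden field."""
    token = secrets.token_urlsafe(16)
    _pending_tokens.add(token)
    return token

def accept_submission(token):
    """Return True the first time a token comes back, False for any repeat."""
    if token in _pending_tokens:
        _pending_tokens.remove(token)
        return True
    return False

if __name__ == "__main__":
    token = issue_form_token()
    print(accept_submission(token))   # first click on Submit
    print(accept_submission(token))   # second click with the same form
```

The handler processes the form only when accept_submission() returns True, and can show the earlier result (or a notice) otherwise.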

Tables

Don't use tables to display the result of long queries to a SQL server, as many browsers must fetch a table in its entirety before displaying it. In this case, just display a subset of the SELECT, with the familiar "First, Prev 25, Next 25, Last" links.
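
Displaying a subset of the SELECT can be sketched as follows in Python, assuming a database that supports LIMIT and OFFSET (sqlite3 is used here). The tickets table and the fetch_page, last_page, and demo names are made up for the example.

```python
import sqlite3

PAGE_SIZE = 25  # rows per page, as in "Prev 25 / Next 25"

def fetch_page(conn, page):
    """Return the rows for a zero-based page number, PAGE_SIZE rows at a time."""
    offset = page * PAGE_SIZE
    cur = conn.execute(
        "SELECT id, title FROM tickets ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    )
    return cur.fetchall()

def last_page(conn):
    """Return the zero-based number of the last page, for the 'Last' link."""
    total = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
    return max(0, (total - 1) // PAGE_SIZE)

def demo():
    """Build a throwaway table holding 60 rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO tickets (title) VALUES (?)",
        [("ticket %d" % n,) for n in range(1, 61)],
    )
    return conn

if __name__ == "__main__":
    conn = demo()
    print(len(fetch_page(conn, 0)), len(fetch_page(conn, last_page(conn))))
```

"First" is page 0, "Prev" and "Next" are the current page minus or plus one, and "Last" is last_page(). The ORDER BY matters: without it the database is free to return rows in a different order on each request.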

Back button

Client-side code

Security

Can .htaccess authentication be skipped for a user who has logged on cleanly from the main page and reaches a sub-directory as part of regular browsing, while a non-authenticated user snooping around is prompted for a login/password when hitting that sub-directory directly? The only difference with a cookie is that the login/password entered at an .htaccess prompt is volatile and disappears once the browser is closed.
Buffer overflows caused by longer-than-expected parameters
Tampered-with JavaScript
Public access to callables that should be off-limits to non-authorized users

General infos

Which environment to use?

Why choose an environment like Zope over ASP.Net over PHP?

The former is an object-oriented web server, while the latter are just frameworks that work from a regular filesystem-based web server. This means that the server that handles scripts is constantly running instead of being launched by the front-end web server whenever a script must be handled. This offers such benefits as writing applications in an object-oriented way instead of just accessing scripts located on a file system.

For instance, on a regular web server, http://localhost/support/index means fetching and interpreting the index page located in the support/ sub-directory on the filesystem, while the same URL on a persistent server means calling support.index, ie. the index method of the support object.
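
The URL-to-method mapping described above can be sketched as follows in Python. This is only an illustration of the idea, not how Zope itself is implemented; the Support class, the ROOT dictionary, and the dispatch function are made up for the example.

```python
class Support:
    """An application object whose public methods are reachable by URL."""

    def index(self):
        return "Support home page"

    def _purge(self):
        return "internal helper, not reachable from a URL"

# Objects published at the top level of the site (hypothetical).
ROOT = {"support": Support()}

def dispatch(path):
    """Map '/support/index' to ROOT['support'].index() and return its result."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        raise LookupError(path)
    obj_name, method_name = parts
    obj = ROOT.get(obj_name)
    # Refuse names starting with an underscore so that only deliberately
    # published methods can be called from outside.
    if obj is None or method_name.startswith("_"):
        raise LookupError(path)
    method = getattr(obj, method_name, None)
    if not callable(method):
        raise LookupError(path)
    return method()

if __name__ == "__main__":
    print(dispatch("/support/index"))
```

Because any callable attribute can become a URL, a real server also needs an access check at this point, which is what a built-in security framework provides.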

From: kosh@aesaeion.com
Cc: zope@zope.org

Well I would disqualify all .NET stuff by default because they are proprietary technology and in my experience so far you will get burned EVERY SINGLE TIME with proprietary tech.

That basically leaves you with free software solutions. I prefer python code over php and perl for readability and ease of coding.

That leaves python based solutions of which there are a number. However I like being able to build applications that are pretty secure and zope has an entire security framework built in that you can tie into pretty easily and it has an OODB that it writes to which makes it easy to work with at least from my perspective.

Thus my choice is ZOPE.

Hello World in plain CGI

#!/usr/local/bin/python

def main():
    # A CGI response is made of headers, a blank line, then the body
    print("Content-type: text/html")
    print("")
    print("<TITLE>Hello, World!</TITLE>")
    print("Hello, World!")

if __name__ == "__main__":
    main()
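
Going one step further than Hello World, here is a sketch of a CGI script in Python 3 that reads a parameter from the query string and escapes it before echoing it back, which is the basic defence against cross-site scripting. The parameter name "name" and the render function are assumptions made for the example.

```python
import html
import os
import sys
from urllib.parse import parse_qs

def render(query_string):
    """Build a complete CGI response (headers, blank line, body) as a string."""
    params = parse_qs(query_string)
    name = params.get("name", ["World"])[0]
    # Escape user input before putting it into HTML, so that markup sent
    # by an attacker is displayed as text instead of being executed.
    safe_name = html.escape(name)
    return (
        "Content-Type: text/html; charset=utf-8\r\n"
        "\r\n"
        "<title>Hello</title>\n"
        "<p>Hello, %s!</p>\n" % safe_name
    )

if __name__ == "__main__":
    # The web server passes the part of the URL after '?' in QUERY_STRING.
    sys.stdout.write(render(os.environ.get("QUERY_STRING", "")))
```

Requesting the script with ?name=Alice greets Alice, while ?name=<script> comes back with the angle brackets escaped.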

Resources

Books

Tools

DENIM - An Informal Tool For Early Stage Web Site and UI Design

Philip and Alex's Guide to Web Publishing
Software Engineering for Internet Applications by Eve Andersson, Philip Greenspun, and Andrew Grumet
Internet Application Workbook by Eve Andersson, Philip Greenspun, and Andrew Grumet

Database Backed Web Sites: The Thinking Person's Guide to Web Publishing
Front-End, Back-End, All Across the Net by Larry Seltzer
CGI Developer's Guide by Eugene Eric Kim
CGI Programming Unleashed by Eugene Eric Kim
CGI Manual of Style Online by Robert McDaniel
Special Edition Using CGI Written by Jeffry Dwight and Michael Erwin
Web Scripting Secret Weapons by Scott Walter
Web Programming Unleashed by Bob Breedlove, et al.
Web Programming Desktop Reference 6-in-1 by Michael Afergan, et al.
Rich Clients
Echo - a framework for developing object-oriented, event-driven Web applications
I3ML - Creating Rich User Interfaces As Web Applications by David M Bridgeland
Cokinetic
The coming RIA wars: A roundup of the Web's new face by Dion Hinchcliffe
Free web servers for Windows: BRS WebWeaver, SHTTPD (Simple HTTPD), Lighttpd, KF Web Server
Know your Enemy: Web Application Threats
The case against Web apps - Five reasons why Web-based development might not be the best choice for your enterprise By Neil McAllister
