Developing Web-based applications
Best practices
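- Use HTTPS for the login page to prevent eavesdropping. Once the user
has been successfully authenticated, generate a session ID and send it to
the client as a cookie. Going back to HTTP afterwards is faster, but it
exposes the session ID to eavesdropping, so stay on HTTPS whenever the
session gives access to anything sensitive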
- Passwords should be stored as salted hashes, and other sensitive data like
credit card data should be saved encrypted, either in the database or in an
include file outside the web server's docs directory
- Use POST to send data to the server
- Sanitize input data, e.g. with mysql_real_escape_string() or, better, with
parameterized queries, as attackers will try to send SQL, exceedingly long
variables, etc. Generally speaking, don't trust user input (SQL injection,
cross-site scripting attacks, etc.)
- On a subscription server, don't assume that the user will visit pages
in a particular order, and check user access in every page/controller
using the session ID
- Remember that a user may access the application multiple times at the
same time with the same user account, either through multiple windows of
the browser on one computer, or through multiple computers
- Do not save anything in cookies apart from a session ID, which can be encrypted
for added security
- Session IDs should be time-limited, so that a session left open on a client
host expires after X minutes of inactivity
- To keep things tidy, use the same page/controller to display a form,
check its input, compute, and send a reply
- Do not use SELECT *: The order and position in which columns are returned
does not remain the same if you add, move, or delete columns. A simple change
to your table structure could cause your application to fail
- Watch out for the back button
- Anticipate that your site may become much more popular and might need
to run on multiple web servers connecting to multiple DB servers with write
requests handled by more than one DB server...
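Several of the points above (hashed passwords, not trusting input, no SELECT *) can be sketched in a few lines of Python. This is only an illustration under assumptions: the users table, its column names, and the iteration count are made up for the example, and SQLite stands in for whatever database you actually use.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 100_000  # assumed work factor; tune it for your hardware


def hash_password(password, salt):
    """Derive a salted hash so the clear-text password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def create_user(db, login, password):
    salt = os.urandom(16)
    # Placeholders (?) keep user input out of the SQL text itself
    db.execute(
        "INSERT INTO users (login, salt, pw_hash) VALUES (?, ?, ?)",
        (login, salt, hash_password(password, salt)),
    )


def check_login(db, login, password):
    # Name the columns explicitly instead of using SELECT *
    row = db.execute(
        "SELECT salt, pw_hash FROM users WHERE login = ?", (login,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison of the stored and recomputed hashes
    return hmac.compare_digest(stored, hash_password(password, salt))


# Hypothetical schema, created in memory for the example
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (login TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")
create_user(db, "alice", "s3cret")
```

A login such as ' OR '1'='1 is simply treated as an unknown user name, because the placeholder never lets it become part of the SQL statement.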
Why write web applications?
- Ease of deployment since, if you pay attention not to use browser- and
version-specific pages, a web application can be accessed from any host
that has a web browser preinstalled
- Any update to the code only needs to be performed on the server(s)
- Widgets supported by browsers are not as sophisticated as those available
in dedicated applications, which in a way is a good thing because users only
need to know how to click, hence reduced training and fewer support calls
- Make sure you separate business logic (code), presentation (the actual
input/output as displayed by the web browser), and data access, using templates
- Frameworks like Enhydra, Midgard, or Zope provide tools to speed up
development over plain languages like Perl or PHP. Multiple templating systems
are also available for Perl, PHP, and Python. Read about the Struts
framework to learn more about the MVC (Model-View-Controller) model
Why not write web applications?
R.I.P. Browsers
by Miss Rogue
What kind of architectures are there?
- Server-side: CGI, embedded code in HTML pages (PHP, Perl, ASP, JSP,
etc.), application server (Java servlets, Zope, etc.)
- Client-side: embedded script (JavaScript, VBScript, AJAX), embedded binaries
(Java applets, ActiveX components)
Which tool to use?
There are basically two kinds of web sites:
- Those whose purpose hasn't changed from its original goal, ie. serve
documents (except you should go from static to dynamic pages to make it
a snap to display documents in a layout that is consistent throughout the
site and easy to change, eg. adding a header + footer + navigation bar)
- Those that are mostly web applications, eg. Hotmail.
For the former, dynamic pages executed by a standard web server are fine,
but for the latter, you should take a look at application servers. They usually
have their own embedded web server but, on bigger sites, are best run behind
a web server acting as a front end.
Application server
What is it?
- Compiles all the code for the application, and keeps it in RAM
- The interpreter doesn't have to be started each time the application is
accessed
- Remains in memory at all times, which improves performance and makes
it easier to maintain state information, i.e. to work around the statelessness
of HTTP, and to keep connections to the database open
- URLs can be turned into calls to objects and methods, and hence
look a lot more familiar to developers who have no experience with web applications
but plenty of experience with dedicated applications written in OOP
- Is often needed for languages other than PHP when running with Apache, as
starting their interpreter for each request makes them too slow compared to PHP
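To make the "remains in memory" point concrete, here is a minimal sketch of a persistent application written against Python's standard WSGI interface. The hit counter and the port number are assumptions made up for the example. The point is only that module-level state survives from one request to the next because the process is never restarted.

```python
import itertools
from wsgiref.simple_server import make_server

# Module-level state lives as long as the server process does; under plain
# CGI it would be rebuilt from scratch on every request.
hit_counter = itertools.count(1)


def application(environ, start_response):
    """WSGI entry point, called once per request by the long-running server."""
    hits = next(hit_counter)
    body = "Hello, request number {}\n".format(hits).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the application with the standard library's reference server."""
    # Port 8000 is an arbitrary choice for this sketch
    make_server("localhost", port, application).serve_forever()
```

Calling serve() and reloading http://localhost:8000/ a few times shows the counter going up, something a script started afresh for each request could only do by saving the value somewhere.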
Why should I use it instead of writing code-filled pages?
Traditionally, web applications were written by creating pages in the web
server's htdocs directory that the browser would call, and that the interpreter
would execute before returning the output. For example:
- /index.php is called by the client
- The web server detects that this page contains PHP code, and has it
executed by the PHP interpreter
- The PHP interpreter outputs data to the web server which is then passed
on to the browser for rendition.
Without an application server, each page must be loaded and compiled each
time it is requested, and state information is handled by saving
a session ID and extra information into a database.
As a practical example, page1.php shows a login/password page and a Next button
which calls page2.php, which will connect to a database to check the credentials.
Problem is, a regular web server has no built-in mechanism for pages to communicate, so
page1.php cannot hand the login/password to page2.php directly. Various
solutions have been found to work around this (hidden form fields, cookies,
sessions saved on the server), but using an application server
is a better solution if you have the choice of running your own web server.
More information
OWASP's top risks list
From the Open Web Application Security Project:
- Unvalidated parameters: Failure to validate information from Web requests
before it is used by a Web application. Attackers can use these flaws
to attack backend systems through a Web application.
- Broken access control: Restrictions on what authenticated users are
allowed to do are often not properly enforced. Attackers use this to access
other users' accounts, view sensitive files or run unauthorised functions.
- Broken account and session management: Account credentials and session
tokens left without proper protection, leading to the risk that crackers
could assume victims' identities.
- Cross-site scripting flaws: A modern classic - mistakes here mean Web
applications can be used as a mechanism to steal session tokens, attack
a local machine or spoof content.
- Buffer overflows: Arguably the most common type of security risk (so
why isn't it number one? Ed). Sloppy programming means applications fail
to properly validate inputs - so maliciously constructed, malformed requests
can crash a process and be used to inject hostile code into target machines.
- Command injection flaws: If an attacker can embed malicious commands
in parameters passed to external systems these may be executed on behalf
of a web application, to unpleasant effect.
- Error handling problems: If an attacker can cause errors which
are improperly handled, all manner of mischief (information disclosure,
system crashes etc.) might be possible.
- Insecure use of cryptography: Web apps frequently use cryptography.
If that's not coded properly, sensitive information won't be adequately
protected.
- Remote administration flaws: If remote Web admin tools are insecure
then an attacker stands a chance of gaining full access to all aspects of
a site.
- Web and application server misconfiguration: Don't trust out of the
box security
Going from dedicated apps to web apps
Forms
There are two ways to check input from a form, e.g. checking that all the
required fields were filled in correctly:
- The traditional way, which is to call a second page using the ACTION
attribute of the FORM tag, which is called after the user has hit a submit button.
This requires a round-trip each time a field is incorrect
- The more advanced way, which is to embed some JavaScript code in the
form, and call this code when the user clicks on the submit button, without
letting the user upload the results until all the fields are correctly filled in.
Here's a minimal sample:
<form name="my_form" id="my_form"
  onsubmit="
    // Inside an inline handler, this refers to the form itself
    for (var i = 0; i < this.elements.length; i++) {
      var field = this.elements[i];
      // Buttons have nothing for the user to fill in
      if (field.type == 'submit' || field.type == 'button') continue;
      if (field.value == '') {
        alert('Cannot submit: please fill in the field ' + field.name);
        field.focus();
        return false;  // cancel the submission
      }
    }
    return true;
  ">
</form>
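Since JavaScript can be disabled or tampered with, the same check is worth repeating on the server. The following Python sketch shows one way to do it. The function name, the field names and the 200-character limit are assumptions for the example, and fields stands for whatever dictionary of submitted values your framework gives you.

```python
def validate_form(fields, required, max_length=200):
    """Return a dict mapping field names to error messages; empty means valid."""
    errors = {}
    for name in required:
        # A missing field and a blank field are treated the same way
        value = fields.get(name, "").strip()
        if not value:
            errors[name] = "This field is required"
        elif len(value) > max_length:
            # Reject exceedingly long values instead of passing them on
            errors[name] = "This field is too long"
    return errors
```

For example, validate_form({"login": "bob"}, ["login", "email"]) reports an error for email only, and the page can then redisplay the form with that message.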
Cookies
What are cookies and how can I use them in my scripts?
Sessions
What if you need to restart the server (either plain or persistent)? If you use
server-side sessions with no database, all information is lost...
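One way around this is to keep sessions in a database rather than in the server's memory. The sketch below uses SQLite from Python. The table layout, the function names and the 15-minute timeout are assumptions chosen for the example, not a prescribed design.

```python
import secrets
import sqlite3
import time

SESSION_TIMEOUT = 15 * 60  # seconds of inactivity; an assumed value


def open_store(path):
    """Open (or create) the session table; a file path survives restarts."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sessions "
        "(sid TEXT PRIMARY KEY, login TEXT, last_seen REAL)"
    )
    return db


def create_session(db, login, now=None):
    sid = secrets.token_urlsafe(32)  # unguessable value to send as the cookie
    now = time.time() if now is None else now
    db.execute(
        "INSERT INTO sessions (sid, login, last_seen) VALUES (?, ?, ?)",
        (sid, login, now),
    )
    db.commit()
    return sid


def lookup_session(db, sid, now=None):
    """Return the login for a live session, or None if unknown or expired."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT login, last_seen FROM sessions WHERE sid = ?", (sid,)
    ).fetchone()
    if row is None:
        return None
    login, last_seen = row
    if now - last_seen > SESSION_TIMEOUT:
        db.execute("DELETE FROM sessions WHERE sid = ?", (sid,))
        db.commit()
        return None
    # Any activity pushes the expiry time back
    db.execute("UPDATE sessions SET last_seen = ? WHERE sid = ?", (now, sid))
    db.commit()
    return login
```

Passing a file name to open_store() keeps sessions across a server restart, while ":memory:" behaves like the in-memory sessions described above and loses everything.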
Templates
What are templates and how can I use them?
Frameworks
Client-side code
Basic JavaScript
Since ActiveX and Java applets never really made it, the only game in town
is JavaScript. Apart from some similar-looking syntax, JavaScript has nothing
to do with Java and, in particular, does not require a compiler to be installed
on the client host for the downloaded code to run. Instead, JavaScript is clear-text
code embedded in web pages, which the browser runs once it has been downloaded as part of
the web page. Its main uses are to build richer interfaces (e.g. trees, WYSIWYG
editors, etc.), write event-driven code (e.g. copying into a text box the value
of the item that was selected in a combobox), and check the contents of a form
so that the result is only uploaded once the user has entered valid parameters. Relying
on the standard <FORM ACTION="myscript.html"> alone translates into multiple
round-trips between the server and the client until all the fields are correctly
filled in.
AJAX
Server-side code
CGI
The Common Gateway Interface, or CGI, is a standard for external gateway
programs to interface with information servers such as HTTP servers. The reason
you should look at alternatives is that a CGI call forks a new process (or
launches a new thread) each time a script is called, which gives terrible
performance under load.
wincgi
Looks like a way to write CGI scripts in VB. http://docs.rinet.ru:8083/UCGI/ch14.htm
mod_cgi
mod_cgi (Dynamic Content with CGI) is the Apache module to run scripts through CGI, which
is the slowest method to run scripts.
mod_fastcgi
mod_fastcgi (FastCGI: A High-Performance Web Server Interface) is a faster method: a
server runs constantly and serves dynamic content when requested by the web
server. This increases performance and provides persistence.
mod_scgi
mod_scgi is an even faster solution (mod_scgi is the Apache module, while the "scgi"
package is the server part, written in Python).
mod_pcgi and PCGI
mod_pcgi is a module, while PCGI is a CGI script (both built by the Zope team).
mod_python
mod_python is an Apache module that
embeds the Python interpreter within the server. With mod_python you can
write web-based applications in Python that will run many times faster than
traditional CGI and will have access to advanced features such as the ability to
retain database connections and other data between hits, and access to Apache
internals.
Other Apache modules to run Python scripts are mod_snake and PyApache (which
seems to be no longer developed), but mod_python seems the best choice of the three.
All of those modules are just ways to connect Apache with Python scripts,
but are not frameworks in themselves.
The current main differences between mod_snake and mod_python are:
- mod_snake works in both the Apache 1.3 and 2.0 series. mod_python currently
only runs in Apache 1.3.
- mod_python runs under Windows; mod_snake does not.
- mod_snake allows modules written in Python to create their own configuration
directives, and gives the modules more power, similar to that of C-style
Apache modules (via per-dir configs, etc.). mod_python only allows
for mod_perl-style callbacks.
ISAPI, NSAPI
ISAPI and NSAPI are the extension APIs of Microsoft's IIS and Netscape's web servers,
respectively.
Caching
http://www.pacificnet.net/~johnr/meta.html
http://www.webthing.com/tutorials/cgifaq.3.html#19
Displaying images
Use the WIDTH and HEIGHT attributes so the browser can lay out and display the
text before the pictures have been downloaded.
Access to a SQL server
SQL servers are not optimized for intensive connection/operation/disconnection
cycles, which is what CGI will do. Use a persistence engine so that the connection/disconnection
only occurs when starting/stopping the engine. Even better, find a web server
that can talk directly to an SQL server so your application doesn't have to
handle that part. Also, you might want to run the web server and the SQL server
on different hosts, and remember to set up the SQL server so that it only accepts
requests coming from the web server (one less way for your data to be attacked
from the Net). Obviously, you will have remembered to change the default username/password
combo before linking the SQL server to the world...
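As a rough illustration of connecting once and reusing the connection, here is a small pool sketched in Python. The ConnectionPool class and handle_request function are hypothetical names, and SQLite is used only so the example is self-contained. A real SQL server would be reached through its own driver.

```python
import queue
import sqlite3


class ConnectionPool:
    """Open a fixed number of connections once and hand them out on demand."""

    def __init__(self, database, size=2):
        self._idle = queue.Queue()
        for _ in range(size):
            # check_same_thread=False lets worker threads share pooled connections
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    def acquire(self):
        return self._idle.get()  # blocks until a connection is free

    def release(self, conn):
        self._idle.put(conn)

    def close(self):
        # Only called when the persistent engine itself shuts down
        while not self._idle.empty():
            self._idle.get().close()


def handle_request(pool):
    """Serve one request without paying for a connect/disconnect cycle."""
    conn = pool.acquire()
    try:
        return conn.execute("SELECT 1 + 1").fetchone()[0]
    finally:
        pool.release(conn)


pool = ConnectionPool(":memory:", size=2)
```

The connections are opened when the pool is created, i.e. when the engine starts, and every later request simply borrows one and gives it back.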
Multiple Submits
When the user clicks on Submit more than once... http://www.webthing.com/tutorials/cgifaq.3.html#19
Tables
Don't use tables to display the result of long queries to a SQL server, as
a table must be fetched in its entirety before being displayed. In this case,
just display a sub-set of the SELECT, with the familiar "First, Prev 25,
Next 25, Last" links.
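The paging described above can be sketched with LIMIT and OFFSET, here in Python with SQLite. The tickets table and its columns are invented for the example, and the syntax for limiting rows differs between SQL servers, so treat this as an outline rather than portable SQL.

```python
import sqlite3

PAGE_SIZE = 25


def fetch_page(db, page):
    """Return one page of rows (page numbers start at 0)."""
    return db.execute(
        "SELECT id, title FROM tickets ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    ).fetchall()


def last_page(db):
    """Page number the 'Last' link should point to."""
    total = db.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
    return max(0, (total - 1) // PAGE_SIZE)


# Hypothetical data: 60 rows, i.e. two full pages and a third with 10 rows
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO tickets (title) VALUES (?)",
    [("ticket %d" % n,) for n in range(1, 61)],
)
```

"First" is then page 0, "Prev 25" and "Next 25" are the current page minus or plus one, and "Last" is whatever last_page() returns.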
Back button
Client-side code
Security
- Can .htaccess protection be satisfied automatically once the user has logged on
cleanly from the main page and then reaches a sub-directory as part of regular
browsing, while a non-authenticated user snooping around is prompted for a
login/password when hitting a sub-directory directly? The only difference with a
cookie is that the login/password entered at an .htaccess prompt is volatile, and
disappears once the browser is closed.
- Buffer overflows caused by longer-than-expected parameters
- Tampered-with JavaScript
- Public access to callables that should be off-limits to non-authorized
users
General infos
Which environment to use?
Why choose an environment like Zope over ASP.Net over PHP?
The former is an object-oriented web server, while the latter are just frameworks
that work from a regular filesystem-based web server. This means that with Zope,
the server that handles scripts is constantly running
instead of being launched by the front-end web server whenever a script
must be handled. This brings benefits such as writing applications in an object-oriented
way instead of just accessing scripts located on a file system.
For instance, on a regular web server, http://localhost/support/index
means fetching and interpreting the index page located in the support/ sub-directory
on the filesystem, while the same URL on a persistent server means calling support.index,
ie. the index method of the support object.
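The mapping from a URL to a method call can be sketched in plain Python. This is not how Zope itself is implemented. The Support class, the ROOT dictionary and the dispatch function are hypothetical names used only to show the idea.

```python
class Support:
    """A published object; its public methods become URLs."""

    def index(self):
        return "Support home page"

    def _secret(self):
        return "not reachable from a URL"


ROOT = {"support": Support()}  # hypothetical object tree


def dispatch(path):
    """Map '/support/index' to ROOT['support'].index()."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        raise LookupError(path)
    obj_name, method_name = parts
    obj = ROOT.get(obj_name)
    # Refuse private names so only intended callables are published
    if obj is None or method_name.startswith("_"):
        raise LookupError(path)
    method = getattr(obj, method_name, None)
    if not callable(method):
        raise LookupError(path)
    return method()
```

With this in place, dispatch("/support/index") calls the index method of the support object, while a request for /support/_secret is refused, which is one way to keep callables off-limits to the outside world.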
Well I would disqualify all .NET stuff by default because they are
proprietary technology and in my experience so far you will get
burned EVERY SINGLE TIME with proprietary tech.
That basically leaves you with free software solutions. I prefer
python code over php and perl for readability and ease of coding.
That leaves python based solutions of which there are a number. However I
like being able to build applications that are pretty secure and zope has
an entire security framework built in that you can tie into pretty easily
and it has an OODB that it writes to which makes it easy to work with
at least from my perspective.
Thus my choice is ZOPE.
Hello World in plain CGI
#!/usr/local/bin/python
# Written for Python 3, where print is a function

def main():
    # The HTTP header block must end with a blank line
    print("Content-type: text/html")
    print()
    print("<TITLE>Hello, World!</TITLE>")
    print("Hello, World!")

if __name__ == "__main__":
    main()
Resources
Books
Tools
Sites