Metadata Registry Development Docs
This page will be your guide to what the heck we're doing
The point of all this?
Right at the moment we're trying to do just a couple of things:
- Provide a meaningful way of registering controlled vocabularies for both the RDF and XML communities. This means RDF and XML output; a required label and URI for every concept/term (if you don't have a URI or can't afford one, one will be assigned by the court); and stable, resolvable URIs
- Provide a structured environment that empowers self-defining communities, organizations, and individuals to collaboratively develop and publish controlled vocabularies
Infrastructure
(as of May 19, 2006)
- PHP 5.1.2
- MySQL 5.0.20
- Apache serving PHP (Registry site, mediawiki)
- lighttpd/fastcgi serving Python (trac)
- svnserve serving SVN
- symfony web development framework
Why PHP and not Python or even better Java?
That's a simple question with a complicated answer, but it boils down to the fact that some of the most exciting development is being done with PHP these days. IBM and Oracle have lined up behind PHP and are intent on making it a vehicle for creating enterprise-class software. PHP5 is much better than you think it is and PHP6 is going to rock. And in the end I (Jon) am personally more productive in PHP than I am in Python (the only other real choice). Although that probably wouldn't be true if I wrote in Python as often as I write in PHP. And Java is, well, just not the best tool for this kind of development.
Why MySQL and not Postgres or even better an RDF/XML data store?
I actually would prefer to work with Postgres, but we wanted to work with the xAMP stack and make it easy for others to pick up the software and get going on a variety of platforms. While this is true to an extent of Postgres, it's true to a greater extent of MySQL. The same logic holds for RDF/XML, and in general the kind of data we're storing just doesn't need (I don't think, anyway) the extra layer between the application and the database that most XML/RDF datastores impose. And the free ones aren't quite cooked enough in any event.
All of these points are certainly arguable and subject to the usual religious wars. So I expect there to be broad agreement and disagreement with this strategy.
Development and Release strategy
- Development and testing are done on a dedicated development machine behind the firewall by a single developer (at the moment)
- Code is checked into SVN whenever a feature/fix is enabled and passes unit and acceptance testing. This code is immediately available from SVN in trunk.
- The active production site is served from a symlink to a numbered folder containing the latest release code (e.g. 'htdocs/registry.0.037')
For release to the production server:
- Code is tagged as a release in SVN with the next release number
- Pre-release production code is uploaded to a new numbered pre-release directory on the server
- In the event of structural changes to the database, the production database is copied to a release-numbered copy of the database which is then altered via script
- The pre-release site is tested at a special testing URL
- When the pre-release site passes acceptance tests:
  - If the database needs to be altered, the site is made read-only while the database is copied again, and the update script is run on the copy, which becomes the production database
  - The symlink is changed to point to the new release (e.g. 'htdocs/registry.0.038')
- The previous release code and database are left on the server so we can jump back to them in an emergency
- The second-to-last code and database are removed from the server
We probably won't do this more often than monthly.
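A minimal sketch of the symlink switch described above, in Python. It is an illustration under stated assumptions rather than the actual release script: the link name 'registry-current' and both helper names are hypothetical, and it assumes a POSIX filesystem, where renaming over an existing symlink is atomic.

```python
import os
import re


def next_release_name(current: str) -> str:
    """Return the next numbered release folder name.

    Assumes names end in a dot followed by a zero-padded number,
    as in 'registry.0.037' (hypothetical helper, not part of the Registry code).
    """
    match = re.fullmatch(r"(.+\.)(\d+)", current)
    if match is None:
        raise ValueError(f"not a numbered release name: {current!r}")
    prefix, number = match.groups()
    # Keep the original zero-padding width when incrementing.
    return f"{prefix}{int(number) + 1:0{len(number)}d}"


def switch_release(htdocs: str, link_name: str, release: str) -> None:
    """Point the production symlink at the given release folder.

    The new link is created under a temporary name and then renamed over
    the old one, so the site never sees a missing link.
    """
    if not os.path.isdir(os.path.join(htdocs, release)):
        raise FileNotFoundError(f"release folder not found: {release}")
    link_path = os.path.join(htdocs, link_name)
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted run
    os.symlink(release, tmp_link)  # relative target, e.g. 'registry.0.038'
    os.replace(tmp_link, link_path)  # rename replaces the old symlink itself
```

Because the previous release folder stays on the server, jumping back in an emergency would be the same call with the older folder name, e.g. switch_release('htdocs', 'registry-current', 'registry.0.037'). Restoring the matching database copy is a separate step that this sketch does not cover.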
Trac Milestones
We've been embarrassingly bad about keeping this stuff updated. But we're trying harder now.
Here's a brief guide to how we're using a couple of Trac Ticket features:
- Milestone
  - When we expect to implement something. This is time driven rather than version driven and won't be consistently populated. It's a scheduling guide for the developers.
- Version
  - The release version that we expect to fix a bug or release a feature.
- Priority
  - The order in which we expect to do things. This is a scheduling guide for the developers too.
- Severity
  - How bad the defect is. This applies only to defects of course and has nothing to do with priority (well maybe a little bit).
