Custom search engine for legal research database

 

ActaPublica.se is Sweden’s largest online archive of legal records, collected over decades. Users can search and download millions of government documents, narrowing the results with keywords and other filter options.

Documents from numerous Swedish authorities are indexed and added to the archive daily.

The platform relies on Elasticsearch relevance ranking to show the best-matching documents at the top of the result set. Users can also integrate the document search into other applications through the API.
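As a rough illustration of how keyword relevance and filters can be combined in a single Elasticsearch request, the sketch below builds a `bool` query body in Python. The helper name `build_search_body` and the field names (`content`, `authority`, `published`) are assumptions made for this example; the platform's actual index mapping and queries are not described here.

```python
from __future__ import annotations

from typing import Any, Optional


def build_search_body(keywords: str,
                      authority: Optional[str] = None,
                      published_from: Optional[str] = None,
                      size: int = 10) -> dict[str, Any]:
    """Build an Elasticsearch search body (field names are hypothetical)."""
    filters: list[dict[str, Any]] = []
    if authority is not None:
        # Exact-match filter on the issuing authority.
        filters.append({"term": {"authority": authority}})
    if published_from is not None:
        # Only documents published on or after the given date.
        filters.append({"range": {"published": {"gte": published_from}}})
    return {
        "size": size,
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"content": keywords}}],
                # "filter" clauses narrow the result set without scoring.
                "filter": filters,
            }
        },
    }
```

Elasticsearch sorts hits by relevance score by default, so a body like this returns the best-matching documents first while the filters only restrict which documents are considered.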

The application core is built on the Laravel framework and uses Node.js for real-time communication.

Key features:

Wildcard search: The user can enter wildcard expressions to get more relevant results. The symbol * matches any sequence of characters, and ? matches exactly one character in the keyword.
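The meaning of the two wildcard symbols can be shown with a small Python sketch that translates a pattern into a regular expression. This only illustrates the matching rules; it is not the platform's implementation, which is not described here. Elasticsearch itself offers a `wildcard` query that uses the same two symbols. The function names are assumptions for this example, and matching is case-sensitive in this sketch.

```python
from __future__ import annotations

import re


def wildcard_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any sequence of characters, including none
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts))


def matches(pattern: str, term: str) -> bool:
    """Return True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None
```

For example, `matches("skatt*", "skatteverket")` and `matches("dom?", "doms")` are both true, while `matches("dom?", "dom")` is false because `?` requires one more character.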

Agent feature: The user can save search criteria of interest as an ‘agent’ so that the search can be repeated easily. The user is notified about new matching documents through real-time browser notifications and by email.
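The idea behind agents can be sketched as follows: each saved search is checked against a newly indexed document, and a notification is produced for every match. This is a minimal in-memory sketch under stated assumptions. The class and function names are hypothetical, and simple keyword containment stands in for whatever matching the platform actually performs (most likely an Elasticsearch query).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A saved search (hypothetical shape)."""
    owner_email: str
    name: str
    keywords: tuple[str, ...]


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


def agent_matches(agent: Agent, doc: Document) -> bool:
    """True if every keyword of the agent occurs in the document text."""
    text = doc.text.lower()
    return all(keyword.lower() in text for keyword in agent.keywords)


def notifications_for(doc: Document, agents: list[Agent]) -> list[tuple[str, str]]:
    """Return (recipient, message) pairs for all agents matching the document."""
    return [
        (agent.owner_email,
         f"New document {doc.doc_id} matches agent '{agent.name}'")
        for agent in agents
        if agent_matches(agent, doc)
    ]
```

The resulting pairs could then be handed to whichever channels deliver them, such as email and browser notifications.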

Application programming interface: Unique API access credentials can be used to access data from an account, to build custom services and solutions.

Live browser notifications: A real-time socket connection pushes browser notifications to the user when new documents match an agent’s criteria.

Activity logs: The stages of agent processing are logged in the backend so that processing events can be tracked for auditing and debugging.

Request throttling: All API/download requests can be regulated at the user/organization level by the backend administration.
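One common way to regulate requests per user or organization is a token bucket, sketched below in Python. The source does not say which algorithm the platform uses, so treat this as an assumption. The class names and the key scheme (`user:1`, `org:9`) are hypothetical.

```python
from __future__ import annotations

import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(float(self.capacity),
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class Throttle:
    """Keeps one bucket per key, e.g. 'user:1' or 'org:9' (hypothetical keys)."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[key] = bucket
        return bucket.allow()
```

An administrator-facing setting would then map to the `capacity` and `refill_per_second` values for each user or organization. The injectable `clock` keeps the limiter easy to test.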

Multilingual support: The application supports eight major languages: English, Danish, German, Spanish, French, Italian, Portuguese and Swedish. A user can easily switch between languages from the footer of the platform.

Technical information:

AWS services: The application runs on a high-performance EC2 instance to give end users a smooth experience. Documents are stored in AWS S3 and served to the application when a document preview or download is requested.

Automated deployment: Releases are easier and faster because deployments are automated with AWS CloudFormation and Bitbucket Pipelines. AWS CloudFormation is an AWS service for provisioning AWS resources in a secure manner. Bitbucket Pipelines is a CI/CD service.

Real-time data broadcasting: Notification alerts are broadcast to users over a socket channel and appear as browser notifications. This is accomplished by integrating Node.js, Socket.IO and Redis in the system.
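The publish/subscribe pattern behind this kind of broadcasting can be sketched in a few lines of Python. The in-process `Broadcaster` below is only a stand-in for illustration: in the system described here that role is played by Node.js, Socket.IO and Redis, and the channel naming (`user:42`) is an assumption.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class Broadcaster:
    """Minimal in-process publish/subscribe hub (illustrative stand-in)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        """Register a handler, e.g. one per connected browser session."""
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: dict) -> int:
        """Deliver the message to every subscriber; return the delivery count."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)
```

A backend process would publish to a user's channel when an agent finds a new document, and each subscribed connection would forward the message to the browser.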

Elastic Cloud: The Elasticsearch store is hosted on Elastic Cloud, which lets us use the latest Elasticsearch version and provides features such as Elastic X-Pack, extensive monitoring, and snapshots.

Database: The application’s MySQL database runs on Amazon RDS for better data security and easy backups.

Responsive design: The website is built on the Bootstrap framework to provide a responsive, optimized user experience on all devices and screen orientations.

Coding standards: The project follows the PSR-2 coding standard, with proper code comments and PHPDoc blocks, to keep the code readable and easy to maintain.

Future challenges:

Improved UX – We need to keep improving the user experience, based on user feedback and industry developments, so that users can get the most out of the platform.

Elasticsearch – We plan to refactor the queries using newer Elasticsearch query features, to return results faster and improve the overall experience of the system.

Faster notifications – The aim is to further reduce the turnaround time for email notifications by using solutions such as AWS Lambda.

Improve code coverage – We will make better use of design patterns such as the repository pattern and improve unit-test coverage.

Better scalability – LiteBreeze experts plan to take full advantage of AWS services like EC2 autoscaling to improve the scalability of the application.

We trust LiteBreeze with all our web development work in Siren and Acta Publica. They developed the archive service for Acta Publica which all Swedish media firms rely on daily. We are excited about ongoing work for Siren and recommend LiteBreeze for their AWS expertise. - Martin Fredriksson (Stockholm, Sweden)
Team of developers who worked on this project: Preeth, Bheem, Saji, Dileep, Archu