Our Experience

Since our founding in 2004, we have completed numerous software development projects, mainly in the fields of Information Retrieval and Decision Support Systems. Some of them are described below. Most of the projects were done for demanding UK customers, and we have been working for some of them for over seven years.

 

Taxonomy-based Classification and Search Software

Client: A leading UK supplier of taxonomy-based classification and search software

Project Goal: Develop a set of tools for: (1) Taxonomy and ontology authoring; (2) Document classification; (3) Enterprise search.

Delivered Solution: A set of applications that can be combined in many ways to match a particular client's requirements. The applications come with powerful wizards and with import/export tools. Various databases and platforms are supported, and the Windows software has been Vista-certified.

(1) Taxonomy and ontology authoring: The applications are speed-optimised to support taxonomies consisting of millions of terms, internationalised, available in desktop as well as client/server versions, and compatible with industry-standard taxonomy and ontology formats. They offer workflow capabilities as well as dozens of other features.

(2) Document classification: This part consists of two components: Rule Creator (generates categorisation rules from a semantic model) and Document Processor (classifies documents based on the prepared rules). Rule Creator is quality- and speed-optimised and is capable of incremental data processing. Caches built around B-trees and other advanced algorithms have been introduced to support these goals.

(3) Enterprise search: Classification results and the semantic model are searchable in real time; this has been achieved with a distributed C/C++ service making use of dedicated indexes. The system is integrated with Google Search Appliance (GSA), Microsoft SharePoint and other portals and search engines.

Technology: C/C++, Python, wxWidgets, Oracle, MySQL, MySQL Embedded, MS SQL Server, MS Windows, Solaris

Duration: 302 person-months

"Tomasz' skills and professional approach allowed us to solve numerous problems in good time and I am happy to recommend him as a business partner."

"I have been very impressed with the way you have worked on this project - Thank you. You have been very cooperative and professional. You have picked up the work quickly and always came back with very sensible questions and points for clarification."

Knowledge Management System

Client: A UK-based educational institution that facilitates cooperation of scientists on historical research

Project Goal: Create a publishing platform that would allow its users to discover and manage connections between their work and the works of others. This is to be achieved with computer-aided discovery of references to people, places, dates, events, etc. The size and complexity of the data sets provided require both high performance and flexibility.

Delivered Solution: A Knowledge Management System that allows for very flexible data structure definitions and provides its users with convenient means of editing their data. While the system places virtually no restrictions on data structure complexity, it also supports very detailed and precise querying. High performance is maintained through indexing and distributed computing. Desktop as well as web-based user interfaces have been developed.

Technology: Python, MySQL, BerkeleyDB, wxWidgets, MS Windows, Linux

Duration: 326 person-months

"I greatly recommend RD Projekt as an IT Solution and Services Provider Company."

"I am grateful for all the hard work that the team has put in to achieving the deadlines, and look forward to continuing to work together in the future."

"You have a super set of people working for RD Projekt."

Distributed Web Crawler

Client: A German supplier of multi-dimensional search and trend tracking technologies for enterprises and consumers

Project Goal: The company needed an application that would crawl the web and find documents that might be of interest to its customers.

Delivered Solution: A modular system with two main modules, Downloader and Analyser, connected with each other using persistent message queues. Special attention has been paid to efficiency, resulting in a system that can process 200,000 pages per hour on a 4-processor machine connected to the Internet over a 100 Mbit/s link. This has been achieved by running multiple download requests in one process using lightweight threads (Eventlet) and distributing parsing jobs to multiple processors (Celery + multiprocessing). An ultra-high-speed probabilistic set implementation (Bloom filter) has been introduced to avoid requesting the same web page twice. To further speed up downloading, connection pooling and custom DNS request handling have been implemented. The system has been designed to be fail-safe, i.e. to resume processing from the point where it was (even unexpectedly) interrupted. It can be deployed to multiple machines to further speed up the analysis of documents. In such a deployment, each machine can be separately configured to execute a selected set of tasks, and each task can be distributed to multiple machines. Using persistent message queues provides load balancing across the machines and makes it possible to continue processing without disruption when some of the machines fail. The most efficient configuration, which processes approximately 1 million pages per hour, consists of two database servers and four to six machines for downloading and analysing web pages.
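
The duplicate-avoidance idea mentioned above can be illustrated with a minimal Bloom filter sketch. This is not the code delivered in the project: the class name BloomFilter, the helper should_fetch, the bit-array size and the number of hash functions are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, a small false-positive rate."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        # Both sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from two halves of one SHA-256 digest
        # (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def should_fetch(url, seen):
    """Return True the first time a URL is offered, False afterwards."""
    if url in seen:
        return False
    seen.add(url)
    return True
```

The trade-off is that a Bloom filter may occasionally report an unseen URL as already seen (so a page is skipped), but it never reports a seen URL as new, and it needs only a fixed amount of memory regardless of URL length.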

Technology: Python, Celery, Eventlet, LXML, ZeroMQ, MySQL, MongoDB, RabbitMQ, Bloom filters, Linux

Duration: 4 person-months

Document Transformation System for enterprises

Client: A German supplier of multi-dimensional search and trend tracking technologies for enterprises and consumers

Project Goal: The company needed a system that, given a set of structured documents of different formats, would automatically select appropriate transformation routines for each of them and then perform distributed processing of the entire document set. Data sources included databases, CSV files, Microsoft SharePoint servers, and others.

Delivered Solution: First, the system extracts documents from a given data source and produces data records from them. Given an input record, the system looks up the ontologies describing the data schema based on the namespaces found in the file. It computes a set of appropriate transformation routines by matching the ontologies with transformation routine metadata. During this step, an Ontology Reasoner infers additional ontology properties to extend the set of possible transformations. Once a set of transformations is selected, an execution plan is created and saved in the repository. Then the transformations are applied according to the plan and the records are efficiently transformed using multiple machines, connected with each other using message queues. Each of the machines can be configured to perform a selected set of tasks. Having multiple machines allows us to process documents as quickly as data sources are able to expose them.
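
The plan-building step can be pictured with a deliberately simplified sketch. The delivered system matches ontologies against routine metadata with the help of a reasoner; the example below swaps that for a plain breadth-first search over routines described only by an input and an output schema name. The function plan_transformations and the routine and schema names used with it are hypothetical.

```python
from collections import deque


def plan_transformations(source, target, routines):
    """Return the shortest list of routine names turning `source` into `target`.

    `routines` maps a routine name to an (input_schema, output_schema) pair.
    Returns None when no chain of routines connects the two schemas.
    """
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        schema, plan = queue.popleft()
        if schema == target:
            return plan
        for name, (accepts, produces) in routines.items():
            # Follow every routine that accepts the current schema and
            # leads to a schema not reached before.
            if accepts == schema and produces not in visited:
                visited.add(produces)
                queue.append((produces, plan + [name]))
    return None
```

A plan computed this way is just an ordered list of routine names, which is the kind of artefact that could be stored in a repository and later executed step by step on worker machines.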

Technology: Python, C++, Java, JCC, OWL, Microsoft SharePoint, Linux

Duration: 9 person-months

Media Recommendation System

Client: An innovative British start-up delivering Video-on-Demand services via web and mobile channels

Project Goal: The company needed a system that would guide its users to broadcasts of interest.

Delivered Solution: A recommendation system that downloads content descriptions from media providers, classifies the content by performing advanced text and statistical analysis, and then recommends the broadcasts to users. In addition, specialised GUI tools have been implemented to supervise and fine-tune the classification process.

Technology: Java, Python, wxWidgets, MySQL, MS Windows, Linux

Duration: 10 person-months

"You've done a great job [...] and it looks like it's just what we need."

Back-office software platform for a travel agent

Client: One of the leading hotel, business travel and conference booking agents in the United Kingdom

Project Goal: Create a back-office software platform that combines and uniformly represents all booking- and finance-related data. The data came from various third-party systems, both internal and external, in many different formats. The goal was to standardise all that data, making it easily accessible to various company departments, including customer support, invoicing, sales, etc. The scope of the system included web-based GUI tools for company employees, allowing them to examine and modify the data, as well as batch scripts for automated tasks such as credit card charging, invoice generation, interest rate calculation, booking cancellation, etc.

Delivered Solution: A set of system components has been implemented entirely or partially by RD Projekt. This includes: SOAP-compatible web services providing a single point of uniform access to back-office data related to clients and suppliers; automatic batch tools for VAT and interest fee calculation; export of credit card fee charging data in Air Plus format; a web-based GUI application for charging/refunding customer credit cards; an invoice generation, calculation and scheduling tool; as well as a set of libraries for system-wide use, facilitating rapid code development.

Technology: Java, Spring, SOAP, FireStorm/DAO, MS SQL, MS Windows

Duration: 4 person-months

"Tomasz has built an outstanding delivery capability providing solutions on most platforms and databases. I would commend him to any potential client without reservation. The quality and timeliness of his operations deliveries are an asset to anyone."

.NET-based Knowledge Extraction System

Client: A UK-based educational institution that facilitates cooperation of scientists on historical research

Project Goal: The client needed a tool to extract measurable information about people and places from English Wikipedia. This information was then used to populate the knowledge database used by historians.

Delivered Solution: RD Projekt has created a spider-like tool to browse the Wikipedia archive. A custom AI module has been used to identify entries about people and places. The results were further fine-tuned to include only figures and locations of historical importance. Attributes such as places, dates, geographical coordinates, references, etc. have been extracted from Wikipedia articles and then aggregated into tables. A separate, large part of the algorithm was dedicated to converting human-readable dates to absolute date values that could be saved and compared. To verify the results of the automatic extraction, a separate tool has been developed with a dedicated GUI that allowed convenient and quick verification of extracted data, along with a preview of related Wikipedia pages and information highlighting.
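
To give a flavour of the date-conversion problem, here is a minimal sketch that turns a few common human-readable forms into tuples that can be stored and compared. It is only an illustration under simplifying assumptions: the function parse_date, the supported formats ("15 March 44 BC", "June 1815", "c. 1066") and the (year, month, day) representation with 0 for unknown parts are all hypothetical, and the delivered algorithm handled far more cases.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Optional "c." prefix, optional day, optional month name, year, optional era.
_DATE_RE = re.compile(
    r"^(?:c\.\s*)?(?:(?P<day>\d{1,2})\s+)?(?:(?P<month>[A-Za-z]+)\s+)?"
    r"(?P<year>\d{1,4})(?:\s*(?P<era>BC|AD))?$",
    re.IGNORECASE,
)


def parse_date(text):
    """Return a sortable (year, month, day) tuple, or None if unparseable.

    BC years are negative; an unknown month or day is stored as 0.
    """
    match = _DATE_RE.match(text.strip())
    if match is None:
        return None
    month = 0
    if match.group("month"):
        month = MONTHS.get(match.group("month").lower())
        if month is None:
            return None
    day = int(match.group("day") or 0)
    if day and not month:
        return None  # a day without a month is not a meaningful date
    year = int(match.group("year"))
    if (match.group("era") or "").upper() == "BC":
        year = -year
    return (year, month, day)
```

Because Python compares tuples element by element, the resulting values sort chronologically, so parse_date("15 March 44 BC") orders before parse_date("c. 1066").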

Technology: .NET, IronPython, MS Windows, wxWidgets

Duration: 4 person-months

Procurement Optimisation System

Client: McKinsey - a leading international consultancy. RD Projekt acted as a subcontractor on the cost-cutting project conducted by this consultancy for a large international corporation.

Project Goal: The international corporation needed to purchase many different services in many cities around the world. For each city and service, a supplier had to be chosen so that all the necessary services were provided and their total cost was as low as possible. The project goal was to create a system that would automatically solve this problem. The suppliers offered volume discounts, which made the problem computationally difficult ("NP-hard") because a locally optimal choice was not necessarily part of the globally optimal solution.

Delivered Solution: A genetic optimiser that finds the best (or near-best) solutions for various scenarios and constraints selected by the user.
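
The approach can be sketched with a small genetic algorithm. This is an illustrative sketch, not the delivered optimiser (which was written in C++, Python and VBA): the discount model (a flat rate once a supplier wins a minimum number of slots), the function names total_cost, greedy and optimise, and all parameter values are assumptions.

```python
import random
from collections import Counter


def total_cost(assignment, prices, discounts):
    """Cost of an assignment, where assignment[slot] is a supplier index.

    prices[supplier][slot] is the list price of a (city, service) slot;
    discounts[supplier] is a (min_slots, rate) pair: the rate applies to
    everything bought from that supplier once it wins at least min_slots.
    """
    counts = Counter(assignment)
    cost = 0.0
    for slot, supplier in enumerate(assignment):
        min_slots, rate = discounts[supplier]
        factor = 1.0 - rate if counts[supplier] >= min_slots else 1.0
        cost += prices[supplier][slot] * factor
    return cost


def greedy(prices):
    """Locally optimal baseline: cheapest list price per slot, no discounts."""
    suppliers = range(len(prices))
    return [
        min(suppliers, key=lambda s: prices[s][slot])
        for slot in range(len(prices[0]))
    ]


def optimise(prices, discounts, generations=200, pop_size=40,
             mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    num_suppliers, num_slots = len(prices), len(prices[0])

    def fitness(assignment):
        return total_cost(assignment, prices, discounts)

    # Start from the greedy baseline plus random assignments.
    population = [greedy(prices)] + [
        [rng.randrange(num_suppliers) for _ in range(num_slots)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: the best two survive unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            # Uniform crossover followed by per-gene mutation.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [
                rng.randrange(num_suppliers)
                if rng.random() < mutation_rate else gene
                for gene in child
            ]
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)
```

A toy case shows why per-slot choices are not enough. With prices = [[10, 10, 10, 10], [9, 9, 9, 12]] and discounts = [(4, 0.25), (5, 0.0)], the greedy baseline [1, 1, 1, 0] costs 37, while awarding all four slots to supplier 0 triggers its discount and costs 30. Because the greedy assignment is part of the initial population and the best individuals are always kept, optimise never returns anything worse than the greedy baseline.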

Technology: C++, Python, VBA, MS Windows

Duration: 2 person-months

Automated Brand Monitoring on the Web

Client: A Polish media monitoring agency serving many large companies

Project Goal: The agency needed a system that would monitor a set of web sites and immediately inform its operators when new articles (or readers' comments) appear that satisfy given criteria. The set of monitored web pages along with these criteria had to be fully configurable.

Delivered Solution: An application that consists of a Web Crawler module, Text Analysis module and GUI. The contents of each web site are periodically checked and articles satisfying user-defined criteria are tracked.

Technology: C++, wxWidgets, MS Windows

Duration: 2 person-months

Enhancing instant messenger with VoIP using GIPS Voice Engine

Client: A company running one of the most popular instant messengers (more than 6 million users)

Project Goal: Previous in-house efforts to implement VoIP had not resulted in satisfactory sound quality, so it was decided that VoIP should rely on the GIPS Voice Engine.

Delivered Solution: GIPS integrated with the IM in a way that preserves compatibility with older IM client versions

Technology: C/C++, MS Windows

Duration: 2 person-months

Additional modules for Portfolio Management System

Client: A Polish financial group developing investment funds and delivering brokerage and asset management services

Project Goal: The client needed a software module to manage allocations of financial instruments across investors' portfolios. Since there are always multiple solutions possible, the module would need to find the best one (according to various evaluation criteria).

Delivered Solution: RD Projekt implemented the module, which, to RD Projekt's knowledge, is still in use by the client.

Technology: C#, MS SQL Server

Duration: 1 person-month


In addition, RD Projekt team members gained considerable experience at their previous companies, building complex systems for European and American clients. These projects are listed below. Please note that they are not RD Projekt projects, but RD Projekt people were heavily involved as key team members in all of them.

Data Warehouse for behavioral analytics of instant messaging platform's users

Client: One of the largest Polish instant messaging platforms

Technology: Hadoop, Hive, Linux

Intelligent Internet Search Engine

Client: American investor

Technology: Java, Oracle

A System Supporting Crime Detection On The Internet

Client: Pinkerton (leading American detective agency)

Technology: Java, Oracle

Sales And Marketing Optimisation Systems

Clients: American institutions

Technology: Java, Oracle

Linux-Based thin clients

Client: European investor

Technology: C

A Marketing Optimisation System

Client: PKN Orlen S.A. (the largest Polish manufacturer and distributor of fuel)

Technology: PHP.MVC, MS SQL Server with Analysis Services, Transact-SQL

Controlling And Sales Reporting Systems

Client: PGNiG S.A. (Poland's largest Oil & Gas company)

Technology: Oracle Express, Visual Basic, Oracle Financial Analyzer

Activity-Based Costing System

Client: Large Polish telecom

Technology: ASP, VB, C++

Cash Pooling (Liquidity Management), Internet Banking and Payment Processing applications

Client: Large international bank

Technology: J2EE, Oracle

© RD Projekt 2011. All rights reserved