DataCollector

Tags

computer programmer, computer programming, data, Java, software, software design, software development

When I first posted about my DataBridge utility I mentioned the DataCollector, which was a Java-based framework for quickly building apps that interacted via web services with a third-party CRM services provider.

In this post I’ll introduce the DataCollector framework. For obvious proprietary reasons, this will be fairly generic, but I think the basic architecture is worth sharing. It’s a nice example of using factory classes.

This one is complex enough I’ll only be able to cover the basics.

The Background

CRM stands for Customer Relationship Management, and it’s a department (or process) used formally or informally by any company with customers. The name says what it is: Managing your Relationship with your Customers.

When CRM is formalized, it involves basic concepts such as:

Account: Any distinct customer or company. For example, any of the following might have an account at Amazon, Netflix, or Home Depot: individuals, families, churches, small companies, large companies. Accounts have names, addresses, phone numbers, web sites, etc.
Contact: Always a distinct person. Contact and Account are the same for individuals, but group Accounts have multiple Contacts. Contacts have names, phone numbers, titles, etc.
Lead: Kind of the heart of CRM and certainly the heart of Sales. A Lead is perceived chance to make a sale. A Lead involves a Contact at an Account. It also involves a Product.
Opportunity: A Lead becomes an Opportunity when a sale appears likely. The data is essentially the same as a Lead. Opportunities use a different workflow and generally aren’t transient.
Sale: An actual sale of a Product to a Contact at an Account.
Product: The things you sell.

There can be additional types, but those are the core of CRM. Databases that support CRM tend to center around tables with records of these types. (And seriously complicated interrelationships.)

The overall metaphor behind CRM is of a funnel. Lots of Leads come in the wide mouth of the funnel, but as things progress the funnel gets narrower and narrower. Lots of people express an initial interest, but few people actually buy.

People drop off at different points as actually spending money becomes more and more imminent. Hence the funnel. At some point in the funnel, a Lead becomes an Opportunity. The hope is that it reaches the funnel mouth and becomes a sale.

The Requirements

At the database level, there are a number of similarities between record types. For example, usually all record types have an ID as well as a Date-Created and a Date-Modified. They may also have version counters or other fields in common.

These common fields suggest a generic base record type to handle the common fields. Derived record types add fields needed by that record type. This results in a record type hierarchy.

The record types also share operations in common. For example, all can be created, queried, updated, or deleted. This implies a shared interface of common operations.

Many systems have three related record operations: INSERT, UPDATE, and UPSERT.

The first creates a new record, but requires that no matching record exist. If one does, the operation generates an error. The second modifies an existing record, but fails if no matching record exists. The third combines the first two: If a record exists, it’s updated; if not, it’s created.

The result is something like this:

Type	Query	Insert	Update	Upsert	Delete
Account	AcctQry	AcctIns	AcctUpd	AcctUps	AcctDel
Contact	CtctQry	CtctIns	CtctUpd	CtctUps	CtctDel
Lead	LeadQry	LeadIns	LeadUpd	LeadUps	LeadDel
Opportunity	OptyQry	OptyIns	OptyUpd	OptyUps	OptyDel
Sale	SaleQry	SaleIns	SaleUpd	SaleUps	SaleDel
Prod	ProdQry	ProdIns	ProdUpd	ProdUps	ProdDel

So the framework needs to represent all these record types and implement all the operations for each.

Ultimately what we want is the ability to write something like an SQL query against this remote database. Something like:

SELECT name, address.city, address.zip
FROM Account
WHERE address.state = 'MN'
ORDER BY name, address.zip, address.city

(Notice that some fields, like address, are made of sub-fields.)

The general program flow is:

Initialize: read input parameters
Log in to remote server with credentials
Make the request.
Receive the response (allow for timeouts and errors).
Log out from server.
Present the response data (or status if error).

This means we need to, at the least, create object instances at run-time based on text strings like “Account” or “LEAD” or “product” — users might use any case.

Part of what makes this interesting is that while record types have lots of fields (over 100 in most cases), requests to the web service only use the fields of interest.

For example, the query above uses only four fields from the Account record. The request to the web service would look very generally like this:

Query Account
    Name:
    Address:
        City:
        State: MN
        Zipcode:
End Query

The presence of the field in the query means we want that field’s value. Providing a value for a field in a query acts as a filter.

The response is similar, but the fields are filled in with the values from the database:

List Account:
    Account:
        Name: Frobosh Incorporated
        Address:
            City: St. Paul
            State: MN
            Zipcode: 55123
    Account:
        ...etc...
End List

Only requested fields are returned, but note that a query can result in a list of Accounts. Even if only one is found, the returned result is still a list (of one).

(The actual request and response are in web services XML.)

The record types, and their large number of fields, are defined in a vendor-provided WSDL file. This lends itself to automatic code generation to avoid having to write code for each field of each record type (of each operation).

A big part of this project had to be the ability to generate code on the fly based on text inputs.

Here’s an approach, but it’s one with problems:

Account acct = new Account();

acct.setName1("");
acct.setAddressCity("");
acct.setAddressState("MN");

Account_Query qry = new Account_Query();
qry.login(UserName, Password);

Account[] data = qry.get(acct);
qry.logout();

Account.sort(data, OrderByFields);

for d in data {
    print(d.getName1(), d.getAddressZip(), d.getAddressCity());
}

The main problem is the need to hard-code even the simplest query as a new and separate Java project. There’s no good way to automatically handle the fields we need (among the 100+ available). Even the output is hard-coded.

Note that this code punts on the sorting with the mysterious OrderByFields variable. That would be a list of field names, which again shows the need to access fields by name.

Here’s a better approach:

Record recd = Record.newRecord(Type);

for (f in SelectFields) {
    recd.setField(f.name, "");
}

for (f in WhereFields) {
    recd.setField(f.name, f.value);
}

Query qry = Query.newQuery(Type);
qry.login(UserName, Password);

Record[] data = qry.get(recd);
qry.logout();

SortRecords(data, OrderByFields);

for (d in data) {
    print("Record:");
    for (f in SelectFields) {
        print(d.getField(f.name));
    }
}

This code works with any Record type. The Type variable is the record name (for example, “Account”). The SelectFields, WhereFields, and OrderByFields, variables are all lists of name/value pairs.

That means the entire query can be parameterized by strings, which is exactly what’s needed.

I’ll leave it there for now. Explaining the factory classes (and the factory factory class) will take a post of its own.

But here’s a quick taste:

TypeFactoryFactory FF = new TypeFactoryFactory();
ITypeFactory TF = FF.getTypeFactory(Type);

IQuery qry = TF.getQuery();

IRecord rcd = TF.getRecord();
//...etc...
//

The use of Interfaces makes the code 100% generic. The Type variable, again, would be something like “Account” or “Lead”.

I was really proud of this project; it turned out even better than I imagined.

The irony is that I was forced into it as a last-minute save, had to start from scratch, and was under serious time and corporate expectations pressure. A solution had been promised, but the assigned programmer really munged things up (and was transferred out of our department as a result).

Not the first time that happened, actually, but that’s other stories.

With projects this complex, I’m very thankful for a language like Java and a really good programmer’s IDE (such as Eclipse). I also use GVim for heavy editing sessions.

Good tools make it much easier to do good work!

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29

The Hard-Core Coder

~ I can't stop writing code!

DataCollector

The Background

The Requirements

2 thoughts on “DataCollector”

Over to you... Cancel reply

The Background

The Requirements

Share this:

Related

2 thoughts on “DataCollector”

Over to you... Cancel reply