The decision has been made: your organization has decided to move forward with an eForm project. You have chosen an experienced implementation partner, a robust, scalable software vendor, and outfitted your end users with the correct mobile device hardware. Now the fun begins!
From day one your users will start to collect data, and a lot of it. A three-page legacy paper form can quickly turn into an eForm with 800 to 1,500 data fields. You need to be prepared, and this starts with designing and deploying the needed data infrastructure.
Choose a Naming Convention
There are a few best practices you can use when choosing a naming convention for the fields you create in your data warehouse. The field names within the form itself — and, as a result, when you extract from the vendor’s cloud or on-premise server — will depend on the vendor’s naming constraints.
First, choose a naming convention prefix. The first option is to name the fields by the page or page number where they exist. For example, if page three of the eForm has a field called Location Status, you could name that column in your database P3_Location_Status or P3LocationStatus. If the third page is Inspection, the naming convention could take on another form like Inspection_Location_Status or InspectionLocationStatus.
The other logical option would be to base the prefix on the section of the form that it is in. If Location Status is in the General Information section of the form, you could go with Gen_Info_Location_Status or GenInfoLocationStatus. Keep in mind that databases have naming constraints, and the naming conventions may vary by version (2005 vs. 2015) and software vendor (Microsoft vs. Oracle vs. SAP).
Both naming conventions have pros and cons. If your field could move pages often, then the section naming would be best. If your sections or areas are prone to change, then by page name is best. Select one and stick with it, because a combination of naming conventions is confusing for the reporting and integration teams.
Another best practice is to add a suffix that declares the data type. This is beneficial for both reporting and integration. Common suffixes include _txt, _date, _image, _picklist, etc.
Avoid Large Tables
Another issue you may come up against is dealing with very large forms. For example, with Microsoft SQL server and many other databases, table size is limited to a certain number of columns (Microsoft SQL Server tables have a maximum of 1024 columns). To avoid these limitations, use sparse column types or linked tables. Of course, only bring in the columns you really need to avoid moving and storing extraneous data.
At Confluent, we typically start with one table per eForm. This approach brings the data into the database table for that eForm. From there, we move it into other backend systems or use the table for advanced data analysis. We also use a combination of triggers, stored procedures, and integration packages to clean and move the data to the needed location, whether that is in our Azure-hosted data center, or locally to the client’s network.
That being said, sound architecture generally lends itself smaller tables. Do not try to cram multiple form types or use cases into one huge table. This will lead to limited future changes, and when a change is needed it will take a tremendous amount of time and testing.
Plan Ahead
Planning for large datasets and copious forms at the beginning will save a ton of time in the end. You may think you are only going to do one small, simple form in the beginning. However, as other departments or operational areas see the power of eForms, your solution will grow in size and complexity.