What is Data Processing
Data processing is simply the conversion of raw data to meaningful information through a process. Data is manipulated to produce results that lead to a resolution of a problem or improvement of an existing situation. Similar to a production process, it follows a cycle where inputs (raw data) are fed to a process (computer systems, software, etc.) to produce output (information and insights).
Generally, organizations employ computer systems to carry out a series of operations on the data in order to present, interpret, or obtain information. The process includes activities like data entry, summary, calculation, storage, etc. Useful and informative output is presented in various appropriate forms such as diagrams, reports, graphics, etc.
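As a minimal sketch, the input → process → output cycle can be illustrated in a few lines of Python; the exam scores here are purely hypothetical:

```python
# A minimal sketch of the input -> process -> output cycle,
# using hypothetical exam scores as the raw data.
raw_data = [72, 85, 91, 64, 78]          # input: raw data

# process: compute summary statistics
total = sum(raw_data)
average = total / len(raw_data)

# output: meaningful information for the user
report = f"{len(raw_data)} scores, average {average:.1f}"
print(report)  # -> "5 scores, average 78.0"
```

The same shape recurs at every scale: raw inputs go through a process and come out as information that can support a decision.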
Need for data processing
Data processing is important in both business and scientific operations. Business data is processed repeatedly and usually produces large volumes of output. Scientific data requires numerous computations and usually needs outputs generated quickly.
Related: Data Processing Cycle
Data processing methods
1. Manual Data Processing
In manual data processing, data is processed by hand, without the help of any machine or tool. All calculations and logical operations are performed manually, and data is likewise transferred manually from one place to another. This method is very slow, and errors are likely in the output. Data is still processed manually in many small business firms as well as government offices and institutions; in an educational institute, for example, mark sheets, fee receipts, and other financial transactions are prepared by hand. The method is avoided wherever possible because it is labor-intensive, very time-consuming, and highly error-prone. It represents the primitive stage of data processing, from a time when technology was unavailable or unaffordable; with advances in technology, dependence on manual methods has decreased drastically.
2. Mechanical Data Processing
In mechanical data processing, data is processed using devices such as typewriters, mechanical printers, and other mechanical machines. This method is faster and more accurate than manual data processing, but it still represents an early stage of data processing. With the invention and evolution of more complex machines offering greater computing power, this type of processing also began to fade away. Examination boards and printing presses frequently used mechanical data processing devices.
3. Electronic Data Processing
Electronic data processing, or EDP, is the modern technique for processing data. Data and a set of instructions are given to a computer as input, and the computer automatically processes the data according to those instructions. The computer is therefore also known as an electronic data processing machine.
This method of processing data is very fast and accurate. For example, in a computerized education environment, students' results are prepared by computer; in banks, customers' accounts are maintained (or processed) by computers.
Related: Methods of data collection
Methods of Data Processing by electronic means –
1. Batch Processing
Batch processing is a method where the information to be organized is sorted into groups to allow for efficient and sequential processing. Transactions are collected over a period of time and then processed together in a single run, rather than individually as they arrive.
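A minimal sketch of the idea, using hypothetical payroll records: transactions are first sorted into groups, then each group is processed in one sequential pass:

```python
# A sketch of batch processing: records are accumulated, sorted into
# groups, then processed sequentially in a single run.
from collections import defaultdict

records = [
    ("sales", 1200), ("hr", 800), ("sales", 950), ("hr", 600),
]

# Sort the records into batches by department
batches = defaultdict(list)
for dept, amount in records:
    batches[dept].append(amount)

# Process each batch in one sequential pass
for dept, amounts in batches.items():
    print(dept, sum(amounts))  # sales 2150, hr 1400
```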
2. Online Processing
This method utilizes Internet connections and equipment directly attached to a computer. It allows data stored in one place to be used at an altogether different place. Cloud computing can be considered an example of this type of processing. It is used mainly for information recording and research.
3. Real-Time Processing
This technique can respond almost immediately to various signals in order to acquire and process information. It involves high maintenance and upfront costs owing to very advanced technology and computing power. The time saved is greatest in this case, as the output is seen in real time, for example in banking transactions.
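As a rough sketch, real-time processing handles each signal the moment it arrives rather than queuing it for a later batch run; the banking transactions below are hypothetical:

```python
# A sketch of real-time processing: each incoming signal (here, a
# hypothetical banking transaction) is handled the moment it arrives.
def handle_transaction(account, balances, amount):
    """Process one transaction immediately and return the new balance."""
    balances[account] = balances.get(account, 0) + amount
    return balances[account]

balances = {}
# Each call models a signal arriving and being processed at once
handle_transaction("A-100", balances, 500)   # deposit
handle_transaction("A-100", balances, -120)  # withdrawal
print(balances["A-100"])  # -> 380
```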
4. Distributed Processing
This method is commonly utilized by remote workstations connected to one big central workstation or server; ATMs are good examples. All the end machines run fixed software located at a particular place and make use of exactly the same information and set of instructions.
Data Processing Cycle
The Data Processing Cycle is a series of steps carried out to extract information from raw data. Although each step must be taken in order, the order is cyclic: the output and storage stage can lead back to the data collection stage, resulting in another cycle of data processing. The cycle provides a view of how data travels and transforms from collection to interpretation, and ultimately to its use in effective business decisions.
Stages of the Data Processing Cycle
1) Collection
It is the first stage of the cycle, and is very crucial, since the quality of the data collected will impact heavily on the output. The collection process needs to ensure that the data gathered are both defined and accurate, so that subsequent decisions based on the findings are valid. This stage provides both a baseline from which to measure and a target for what to improve.
Some types of data collection include census (data collection about everything in a group or statistical population), sample survey (collection method that includes only part of the total population), and administrative by-product (data collection is a byproduct of an organization’s day-to-day operations).
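The difference between a census and a sample survey can be sketched as follows, using a hypothetical population of household incomes (in thousands):

```python
# A sketch contrasting a census with a sample survey, using a
# hypothetical population of household incomes.
import random

population = [30, 45, 52, 61, 38, 47, 55, 43, 66, 50]  # the whole group

# Census: measure everything in the population
census_mean = sum(population) / len(population)

# Sample survey: measure only part of the population
random.seed(1)
sample = random.sample(population, 4)
sample_mean = sum(sample) / len(sample)

print(census_mean)  # the exact figure
print(sample_mean)  # an estimate of the census figure
```

The sample mean approximates the census mean at a fraction of the collection cost, which is why sample surveys dominate in practice.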
2) Preparation
It is the manipulation of data into a form suitable for further analysis and processing. Raw data cannot be processed as-is and must be checked for accuracy. Preparation is about constructing a dataset from one or more data sources for further exploration and processing. Analyzing data that has not been carefully screened for problems can produce highly misleading results; the quality of the analysis depends heavily on the quality of the prepared data.
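A minimal sketch of this screening step, with illustrative field names and validity rules:

```python
# A sketch of the preparation stage: screen raw records for problems
# before they reach analysis. Field names and rules are illustrative.
raw_records = [
    {"name": "Ann", "age": 34},
    {"name": "", "age": 29},        # missing name -> reject
    {"name": "Raj", "age": -5},     # impossible age -> reject
    {"name": "Mei", "age": 41},
]

def is_valid(rec):
    """Keep only records with a name and a plausible age."""
    return bool(rec["name"]) and 0 <= rec["age"] <= 120

dataset = [rec for rec in raw_records if is_valid(rec)]
print(len(dataset))  # -> 2 clean records ready for processing
```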
3) Input
It is the task where verified data is coded or converted into machine-readable form so that it can be processed by a computer. Data entry is done through the use of a keyboard, digitizer, scanner, or data entry from an existing source. This time-consuming process requires speed and accuracy. Most data need to follow a formal and strict syntax, since a great deal of processing power is required to break down complex data at this stage. Because of the costs, many businesses outsource this stage.
4) Processing
It is the stage where the data is subjected to various means and methods of manipulation: the point where a computer program is executed, containing the program code and its current activity. The process may be made up of multiple threads of execution that simultaneously execute instructions, depending on the operating system. While a computer program is a passive collection of instructions, a process is the actual execution of those instructions. Many software programs are available for processing large volumes of data within very short periods.
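The distinction between a passive program and a process with multiple threads of execution can be sketched as follows; splitting the sum into two slices is purely illustrative:

```python
# A sketch of a process with multiple threads of execution, each
# summing one slice of a dataset concurrently.
import threading

data = list(range(1, 101))
results = {}

def partial_sum(name, chunk):
    # Each thread executes these instructions independently
    results[name] = sum(chunk)

threads = [
    threading.Thread(target=partial_sum, args=("low", data[:50])),
    threading.Thread(target=partial_sum, args=("high", data[50:])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["low"] + results["high"])  # -> 5050
```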
5) Output and interpretation
It is the stage where processed information is transmitted to the user. Output is presented to users in various report formats such as a printed report, audio, video, or a monitor display. Output needs to be interpreted so that it can provide meaningful information that will guide the future decisions of the company.
6) Storage
It is the last stage in the data processing cycle, where data, instructions, and information are held for future use. The importance of this stage is that it allows quick access and retrieval of processed information, which can be passed directly to the next stage when needed. Every computer uses storage to hold system and application software.
Related: Data Management Best Practices
Data Processing System
A data processing system is a combination of machines and people that, for a set of inputs, produces a defined set of outputs. The inputs and outputs are interpreted as data, facts, or information, depending on the interpreter's relation to the system.
A data processing system may involve some combination of:
- Conversion – converting data to another format.
- Validation – Ensuring that supplied data is “clean, correct and useful.”
- Sorting – “arranging items in some sequence and/or in different sets.”
- Summarization – reducing detail data to its main points.
- Aggregation – combining multiple pieces of data.
- Analysis – the “collection, organization, analysis, interpretation and presentation of data.”
- Reporting – list detail or summary data or computed information.
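Several of these operations can be sketched together on a tiny, hypothetical sales dataset:

```python
# A sketch touching several data processing operations on a tiny
# dataset; the field names and values are illustrative.
sales = [
    {"region": "north", "amount": "120"},
    {"region": "south", "amount": "80"},
    {"region": "north", "amount": "200"},
]

# Conversion: string amounts to integers
for row in sales:
    row["amount"] = int(row["amount"])

# Validation: keep only clean, non-negative amounts
sales = [row for row in sales if row["amount"] >= 0]

# Sorting: arrange items in some sequence
sales.sort(key=lambda row: row["amount"])

# Aggregation / summarization: combine rows into one figure per region
totals = {}
for row in sales:
    totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]

# Reporting: list the summary data
for region, total in sorted(totals.items()):
    print(region, total)  # north 320, south 80
```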
Commercial Data Processing
Commercial data processing involves a large volume of input data, relatively few computational operations, and a large volume of output. For example, an insurance company needs to keep records on tens or hundreds of thousands of policies, print and mail bills, and receive and post payments.
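A minimal sketch of this pattern, with hypothetical policy data and an assumed flat 2% premium rule:

```python
# A sketch of commercial data processing: a large volume of input
# records, a simple computation per record, and one output line per
# record. Policy data and the premium rule are hypothetical.
policies = [
    {"id": f"P{i:05d}", "coverage": 10_000 + (i % 5) * 1_000}
    for i in range(100_000)  # a large volume of input records
]

bills = [
    # relatively few computational operations per record
    f"{p['id']}: premium due {p['coverage'] * 0.02:.2f}"
    for p in policies
]

print(len(bills))  # -> 100000 lines of output
print(bills[0])    # -> "P00000: premium due 200.00"
```

Note the shape: the arithmetic per record is trivial, but the input and output volumes are large, which is the defining trait of commercial workloads.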
In a science or engineering field, the terms data processing and information systems are considered too broad, and the more specialized term data analysis is typically used. Data analysis makes use of specialized and highly accurate algorithms and statistical calculations that are less often observed in the typical general business environment.
Almost all fields
It is impossible to think of any field untouched by data processing. Be it agriculture, manufacturing, the service industry, meteorological departments, urban planning, transportation systems, banking, or educational institutions, data processing is required everywhere, with varying levels of complexity.