cloud computing, method of running application software and storing related data in central computer systems and providing customers or other users access to them through the Internet.

Early development

The origin of the expression cloud computing is obscure, but it appears to derive from the practice of using drawings of stylized clouds to denote networks in diagrams of computing and communications systems. The term came into popular use in 2008, though the practice of providing remote access to computing functions through networks dates back to the mainframe time-sharing systems of the 1960s and 1970s. In his 1966 book The Challenge of the Computer Utility, the Canadian electrical engineer Douglas F. Parkhill predicted that the computer industry would come to resemble a public utility “in which many remotely located users are connected via communication links to a central computing facility.”

For decades, efforts to create large-scale computer utilities were frustrated by constraints on the capacity of telecommunications networks such as the telephone system. It was cheaper and easier for companies and other organizations to store data and run applications on private computing systems maintained within their own facilities.

The constraints on network capacity began to be removed in the 1990s when telecommunications companies invested in high-capacity fibre-optic networks in response to the rapidly growing use of the Internet as a shared network for exchanging information. In the late 1990s, a number of companies, called application service providers (ASPs), were founded to supply computer applications to companies over the Internet. Most of the early ASPs failed, but their model of supplying applications remotely became popular a decade later, when it was renamed cloud computing.

Cloud services and major providers

Cloud computing encompasses a number of different services. One set of services, sometimes called software as a service (SaaS), involves the supply of a discrete application to outside users. The application can be geared either to business users (such as an accounting application) or to consumers (such as an application for storing and sharing personal photographs). Another set of services, variously called utility computing, grid computing, and hardware as a service (HaaS), involves the provision of computer processing and data storage to outside users, who are able to run their own applications and store their own data on the remote system. A third set of services, sometimes called platform as a service (PaaS), involves the supply of remote computing capacity along with a set of software-development tools for use by outside software programmers.

Early pioneers of cloud computing include Salesforce.com, which supplies a popular business application for managing sales and marketing efforts; Google, Inc., which in addition to its search engine supplies an array of applications, known as Google Apps, to consumers and businesses; and Amazon Web Services, a division of online retailer Amazon.com, which offers access to its computing system to Web-site developers and other companies and individuals. Cloud computing also underpins popular social networks and other online media sites such as Facebook, MySpace, and Twitter. Traditional software companies, including Microsoft Corporation, Apple Inc., Intuit Inc., and Oracle Corporation, have also introduced cloud applications.

Cloud-computing companies either charge users for their services, through subscriptions and usage fees, or provide free access and charge other companies for placing advertisements within the services. Because the profitability of cloud services tends to be much lower than that of selling or licensing hardware components and software programs, cloud computing is viewed as a potential threat to the businesses of many traditional computing companies.

Data centres and privacy

Construction of the large data centres that run cloud-computing services often requires investments of hundreds of millions of dollars. The centres typically contain thousands of server computers networked together into parallel-processing or grid-computing systems. The centres also often employ sophisticated virtualization technologies, which allow computer systems to be divided into many virtual machines that can be rented temporarily to customers. Because of their intensive use of electricity, the centres are often located near hydroelectric dams or other sources of cheap and plentiful electric power.

Because cloud computing involves the storage of often sensitive personal or commercial information in central database systems run by third parties, it raises concerns about data privacy and security as well as the transmission of data across national boundaries. It also stirs fears about the eventual creation of data monopolies or oligopolies. Some believe that cloud computing will, like other public utilities, come to be heavily regulated by governments.

Nicholas Carr

data analysis, the process of systematically collecting, cleaning, transforming, describing, modeling, and interpreting data, generally employing statistical techniques. Data analysis is an important part of both scientific research and business, where demand has grown in recent years for data-driven decision making. Data analysis techniques are used to gain useful insights from datasets, which can then be used to make operational decisions or guide future research. With the rise of “big data,” the storage of vast quantities of data in large databases and data warehouses, there is an increasing need for data analysis techniques that can generate insights from volumes of data too large to be handled by tools of limited processing capacity.

Data collection

Datasets are collections of information. Generally, data and datasets are themselves collected to help answer questions, make decisions, or otherwise inform reasoning. The rise of information technology has led to the generation of vast amounts of data of many kinds, such as text, pictures, videos, personal information, account data, and metadata, the last of which provide information about other data. It is common for apps and websites to collect data about how their products are used or about the people using their platforms. Consequently, there is vastly more data being collected today than at any other time in human history. A single business may track billions of interactions with millions of consumers at hundreds of locations with thousands of employees and any number of products. Analyzing that volume of data is generally only possible using specialized computational and statistical techniques.

Businesses’ desire to make the best use of their data has led to the development of the field of business intelligence, which covers a variety of tools and techniques that allow businesses to perform data analysis on the information they collect.

Process

For data to be analyzed, they must first be collected and stored. Raw data must be processed into a format that can be used for analysis and cleaned so that errors and inconsistencies are minimized. Data can be stored in many ways, but one of the most useful is in a database. A database is a collection of interrelated data organized so that certain records (collections of data related to a single entity) can be retrieved on the basis of various criteria. The most familiar kind of database is the relational database, which stores data in tables with rows that represent records (tuples) and columns that represent fields (attributes). A query is a command that retrieves a subset of the information in the database. It may return only records that meet certain criteria, or it may join fields from records across multiple tables by use of a common field.
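
As an informal illustration, a minimal sketch using Python’s built-in sqlite3 module shows how records, fields, and queries fit together. The customers and orders tables, their fields, and the sample values are hypothetical assumptions introduced for the example, not drawn from any particular system.

  import sqlite3

  # Build an in-memory relational database with two hypothetical tables.
  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
  conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

  # Each row is a record (tuple); each column is a field (attribute).
  conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                   [(1, "Ada", "West"), (2, "Grace", "East")])
  conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(10, 1, 99.50), (11, 2, 42.00), (12, 1, 15.25)])

  # A query can retrieve only the records that meet certain criteria...
  west = conn.execute("SELECT name FROM customers WHERE region = 'West'").fetchall()

  # ...or join fields from records across tables by use of a common field (customer_id).
  joined = conn.execute(
      "SELECT c.name, o.total FROM customers c JOIN orders o ON o.customer_id = c.id"
  ).fetchall()

  print(west)    # [('Ada',)]
  print(joined)  # pairs such as ('Ada', 99.5), ('Grace', 42.0), ('Ada', 15.25)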

Frequently, data from many sources are collected into large archives of data called data warehouses. The process of moving data from their original sources (such as databases) to a centralized location (generally a data warehouse) is called ETL (which stands for extract, transform, and load); a brief sketch of the process follows the steps below.

  1. The extraction step identifies and copies or exports the desired data from its source, such as by running a database query to retrieve the desired records.
  2. The transformation step is the process of cleaning the data so that they fit the analytical need for the data and the schema of the data warehouse. This may involve changing formats for certain fields, removing duplicate records, or renaming fields, among other processes.
  3. Finally, the clean data are loaded into the data warehouse, where they may join vast amounts of historical data and data from other sources.
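
A minimal sketch of these three steps, written in Python with the built-in sqlite3 module, might look like the following. The table names, fields, and records are entirely hypothetical and stand in for a real operational system and warehouse.

  import sqlite3

  # Hypothetical source database (stands in for an operational system).
  source = sqlite3.connect(":memory:")
  source.execute("CREATE TABLE sales (order_id INTEGER, amount TEXT, order_date TEXT)")
  source.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
      (1, "19.99", "2023-01-05T10:00:00"),
      (2, "5.00", "2023-01-06T14:30:00"),
      (2, "5.00", "2023-01-06T14:30:00"),   # duplicate record to be removed
  ])

  # Hypothetical data warehouse with its own schema.
  warehouse = sqlite3.connect(":memory:")
  warehouse.execute("CREATE TABLE sales_clean (order_id INTEGER PRIMARY KEY, amount REAL, order_date TEXT)")

  # Extract: run a query against the source to copy out the desired records.
  rows = source.execute("SELECT order_id, amount, order_date FROM sales").fetchall()

  # Transform: change field formats (text to number, timestamp to date) and drop duplicates.
  seen, clean = set(), []
  for order_id, amount, order_date in rows:
      if order_id in seen:
          continue
      seen.add(order_id)
      clean.append((order_id, float(amount), order_date[:10]))

  # Load: insert the cleaned records into the warehouse.
  warehouse.executemany("INSERT INTO sales_clean VALUES (?, ?, ?)", clean)
  warehouse.commit()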

After data are effectively collected and cleaned, they can be analyzed with a variety of techniques. Analysis often begins with descriptive and exploratory data analysis. Descriptive data analysis uses statistics to organize and summarize data, making it easier to understand the broad qualities of the dataset. Exploratory data analysis looks for insights into the data that may arise from descriptions of distribution, central tendency, or variability for a single data field. Further relationships between data may become apparent by examining two fields together. Visualizations may be employed during analysis, such as histograms (graphs in which the length of a bar indicates a quantity) or stem-and-leaf plots (which divide data into buckets, or “stems,” with individual data points serving as “leaves” on the stem).
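
For instance, a short Python sketch using only the standard library can compute descriptive statistics for a single field and print a crude text histogram of its distribution; the values below are made-up sample data, not taken from any real dataset.

  import statistics
  from collections import Counter

  ages = [23, 25, 25, 27, 31, 31, 31, 35, 40, 52]   # hypothetical values for one data field

  # Descriptive analysis: central tendency and variability.
  print("mean:  ", statistics.mean(ages))            # 32
  print("median:", statistics.median(ages))          # 31.0
  print("stdev: ", round(statistics.stdev(ages), 1))

  # A crude text histogram: bucket the values into decades and count each bucket.
  buckets = Counter((age // 10) * 10 for age in ages)
  for start in sorted(buckets):
      print(f"{start}s: " + "#" * buckets[start])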

Data analysis frequently goes beyond descriptive analysis to predictive analysis, making predictions about the future using predictive modeling techniques. Predictive modeling uses machine learning, regression analysis methods (which mathematically calculate the relationship between an independent variable and a dependent variable), and classification techniques to identify trends and relationships among variables. Predictive analysis may involve data mining, which is the process of discovering interesting or useful patterns in large volumes of information. Data mining often involves cluster analysis, which tries to find natural groupings within data, and anomaly detection, which detects instances in data that are unusual and stand out from other patterns. It may also look for association rules within datasets, that is, strong relationships among variables in the data.
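
As a small worked example of the regression element, an ordinary least-squares fit of a straight line can be computed directly in Python; the x and y values here are invented purely for illustration.

  # Ordinary least squares for y = a + b*x on a small hypothetical dataset.
  xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable
  ys = [2.1, 4.3, 6.2, 8.1, 9.9]   # dependent variable

  n = len(xs)
  mean_x = sum(xs) / n
  mean_y = sum(ys) / n

  # Slope = covariance(x, y) / variance(x); the intercept places the line through the means.
  b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
  a = mean_y - b * mean_x

  print(f"y = {a:.2f} + {b:.2f} * x")   # roughly y = 0.30 + 1.94 * x

  # Predictive analysis: use the fitted relationship to estimate y for a new x.
  predicted = a + b * 6.0
  print(round(predicted, 1))            # roughly 11.9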

Stephen Eldridge