Finding fraud through textual analysis


Greg Jenkins (left) and Patrick Fan are developing a computerized tool to handle the tedious, time-consuming, and often impossible task of manually examining all available text documents from a firm being audited.

Can textual analysis help auditors identify fraud?

Preliminary findings from research conducted by Patrick Fan and Greg Jenkins, associate professors of accounting and information systems in the Pamplin College of Business, suggest that the technique — used extensively in the social sciences to scrutinize written and oral communication — can be used to identify language patterns in management communications that are inconsistent with either the company's financial performance or with the communications of other companies in the same industry. Such inconsistencies may indicate fraud.

"The results of our initial analysis suggest that our model has substantial predictive power," Jenkins says. "When fraud is committed in companies, there appear to be patterns in corporate communications that imply wrongdoing."

The professors hope to develop their methodology, based on knowledge from auditing and information systems, into a more precise new tool to help auditors and regulators detect fraud. They have received a grant of about $196,000 from PricewaterhouseCoopers for their two-year project, expected to be completed in 2009.


Fraud detection, Jenkins says, is a top priority for the auditing profession, because of its "dramatic, negative effect on the public's confidence in capital markets" and its "staggering" costs. "Glass Lewis & Co. recently estimated that high-profile frauds resulted in the loss of almost $900 billion in market capitalization from 1997 to 2004," he says.

There continues to be an "expectations gap," however — between investors, regulators, and the media on one side, and the auditing profession on the other — regarding auditors' responsibilities for detecting fraud, says Jenkins, a former auditor at Ernst & Young.

Jenkins serves on a research task force that is providing guidance to the Public Company Accounting Oversight Board on matters related to audit firms' quality control.

"Investors and regulators would obviously prefer that auditors find all frauds," Jenkins says. "But, the current standards don't require them to do that — and, frankly, auditors don't have precise enough testing procedures to identify all frauds. An annual report, for example, contains tremendous amounts of information. The numbers of transactions that lead to a company's financial report are such that it is difficult to audit enough transactions to catch all frauds."

Jenkins adds that audit procedures aimed at fraud detection have remained largely unchanged in recent years. The methods used include examining accounts and records for anomalies, interviewing company employees, and reviewing the company's internal controls. Firms, he adds, are working to develop more sophisticated fraud detection techniques.

"Extensive testing is very expensive," Jenkins says. "While technology can be used to perform many audit procedures, much of the audit requires human effort."

He and Fan hope to aid that human effort by developing a computerized tool to handle the tedious, time-consuming, and often impossible task of manually examining all available text documents from the firm being audited.


Fan, a specialist in data and text mining and business intelligence research, says their model uses a technique called text mining to automatically identify word patterns that might be highly associated with financial fraud.

"Once we build a very good mining tool, we can use it to screen all firms within an industry," Fan says.

By recognizing language patterns or trends that are inconsistent with either the company's financial performance or communications issued by other companies in the same industry, the software would guide auditors to particular areas — revenue recognition policies or practices or disclosures of liabilities, for example — that may need further examination.

Explaining the need to compare the company with others in the same industry, Jenkins says that in many instances of fraud, inconsistencies between a company's communications and its financial performance may be difficult to discern. In such cases, benchmarking a company's communication patterns against those of other companies in the same industry may help reveal unusual or unexpected differences.

"Companies in the same industry — with similar products, business lines, competitive regions, and, sometimes, customer bases — tend to describe their transactions in very similar terms," Jenkins says.

A company's financial performance may be similar to that of its competitors, Jenkins says, "yet the language it is using to describe its prospects seems overly optimistic or overly specific or vague relative to others in the industry." Enron's annual reports from the late 1990s, for example, exuded an "unlimited optimism" at a time when other companies were starting to struggle, he says.

"The way Enron described its prospects was inconsistent with how its competitors were describing their prospects," Jenkins says. "Moreover, its descriptions of related-party transactions were incomplete and overly vague."
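To make the idea of industry benchmarking concrete, here is a minimal Python sketch. It is not the professors' model, whose details the article does not describe. It assumes a small, hypothetical list of "optimistic" words (`OPTIMISM_WORDS`) and measures how far one company's use of those words sits from the average of its industry peers, expressed as a z-score. The function names and the word list are illustrative assumptions only.

```python
import re
from statistics import mean, stdev

# Hypothetical word list, chosen only for illustration; it is not taken
# from the research described in this article.
OPTIMISM_WORDS = {"strong", "growth", "record", "excellent", "unlimited", "confident"}


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def optimism_rate(text):
    """Return the share of tokens that appear in OPTIMISM_WORDS."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in OPTIMISM_WORDS)
    return hits / len(tokens)


def peer_z_score(company_text, peer_texts):
    """Return how many peer standard deviations the company's optimism
    rate lies above (positive) or below (negative) the peer average.

    Needs at least two peer documents. Returns None when the peers show
    no variation, because there is then no spread to compare against.
    """
    peer_rates = [optimism_rate(text) for text in peer_texts]
    spread = stdev(peer_rates)
    if spread == 0:
        return None
    return (optimism_rate(company_text) - mean(peer_rates)) / spread
```

A large positive score would only suggest that a company's language is unusually upbeat relative to its peers and may merit a closer look by an auditor; on its own it is not evidence of fraud.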

Developing the benchmark data itself, Jenkins says, is a tremendous challenge. He and Fan have compiled a list of cases of known fraud — companies that have been sanctioned by the Securities and Exchange Commission — and are completing identification of another set of companies, those in the same industries "whose financial statements have stood the test of time."

The professors will use their methodology to compare large volumes of corporate communications — annual reports, letters to shareholders, and transcripts of analyst conference calls, for example — from these two groups of companies, which represent a variety of industries: technology, retail, energy, and consumer products.

"We're tracking tens of thousands of words from multiple companies and multiple periods," Jenkins says. "We're using computing power to go through and look at language to identify patterns — words and frequency of usage — that would be very difficult for a human reader to discern. Our findings so far show that there are systematic differences in textual communications between the two groups of companies."
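The comparison of word usage between the two groups can be illustrated with a short Python sketch. This is an assumed, simplified stand-in and not the method used in the research: it counts words in each group of documents and ranks them by a smoothed log-ratio of relative frequencies, so that words used proportionally more often in the known-fraud group score highest. The function names and the add-one smoothing are illustrative choices.

```python
import math
import re
from collections import Counter


def word_counts(documents):
    """Count lowercase word tokens across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z']+", doc.lower()))
    return counts


def distinctive_words(fraud_docs, clean_docs, top_n=5):
    """Rank words by how much more often they occur in fraud_docs.

    The score is the log of the ratio between a word's relative
    frequency in each group. Add-one smoothing keeps words that are
    missing from one group from producing a division by zero.
    """
    fraud = word_counts(fraud_docs)
    clean = word_counts(clean_docs)
    vocab = set(fraud) | set(clean)
    fraud_total = sum(fraud.values()) + len(vocab)
    clean_total = sum(clean.values()) + len(vocab)
    scores = {
        word: math.log((fraud[word] + 1) / fraud_total)
        - math.log((clean[word] + 1) / clean_total)
        for word in vocab
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]
```

With real data, the inputs would be the annual reports, shareholder letters, and call transcripts from each group, and any pattern found this way would still need validation before it could be treated as a fraud indicator.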

He and Fan envisage their software serving as a decision-support tool that would improve the efficiency of the auditing process, help auditors gain additional sources of evidence, and, ultimately, enable detection of financial fraud.

The grant from PricewaterhouseCoopers, the professors say, will allow their research to be completed more rapidly. The firm launched its "PwC INQuires" program of funding for applied research in 2007 to assist faculty and doctoral students "seeking to increase the knowledge base that contributes to the practice of auditing and tax." In its inaugural year, the program awarded more than $580,000 to 37 researchers for 13 projects. The grant awarded to Jenkins and Fan represents a third of this total.

  • A version of this article originally appeared in the Spring 2008 issue of Pamplin magazine.
  • For more information on this topic, e-mail Sookhan Ho, or call (540) 231-5071.

Blowing the whistle


Management professor Richard Wokutch says the False Claims Act, which offers whistle blowers financial rewards for disclosing fraud against the federal government, has raised “interesting ethical issues.”
