Big Data is a formidable tool in the fight against breast cancer. The growth of data mining in healthcare combined with sophisticated machine learning is poised to make advanced predictive analysis a game-changer in reducing risk, detecting disease earlier, and reducing mortality rates from breast cancer.

More than ever, data analysts stand on the front lines of the fight against cancer. A growing body of published research explores how data mining in healthcare improves cancer research, prediction, and treatment protocols.

With an online Graduate Certificate in Data Analytics, professionals learn the data analytics methods used to improve, and even help save, the lives of millions of women.

Breast Cancer Statistics

About one in eight women in the United States will develop breast cancer at some point in their lives.

The American Cancer Society estimates that there will be about 276,480 new cases of invasive breast cancer in 2020. Add to that another 48,530 new diagnoses of non-invasive “carcinoma in situ” (CIS), the earliest stage of breast cancer. About 42,170 women in the US are expected to die from breast cancer in 2020.

Breast cancer is the second most common cancer among women in the United States, behind only skin cancers. Globally, it is the most common.

A diagnosis of breast cancer is a traumatic discovery, but it doesn’t have to be devastating, and it certainly doesn’t have to be fatal. With improvements in predictive analysis and treatment, the fight against breast cancer can be won.

Surviving Breast Cancer: Data Mining in Healthcare

Breast cancer can’t be prevented outright. But you can lower your risk, and with early detection the disease is eminently survivable.

An annual screening mammogram and a regimen of self-examination, especially for women with genetic risk factors, can lead to early detection, vastly improving your odds of surviving breast cancer.

Put another way, the best way to survive breast cancer is statistical: put the odds in your favor by seeing it well before it arrives and determining the optimal treatment when it does. How is this possible? Big Data, machine learning, and sophisticated predictive analysis.

The 5 V of Big Data

A paper published by the National Center for Biotechnology Information specifies the “5 V” of Big Data: Volume, Velocity, Variety, Veracity, and Value of the data exploited. These are the characteristics of Big Data. The real power, argues the paper, is understanding how these attributes are applied to real-world problems.

“The 5 V,” state the authors, “are insufficient to characterize the essence of the innovation brought by Big Data. The mastery of these algorithms is at the heart of the business of data scientists.”

“Healthcare is one of the most important applications of Big Data,” write the authors of the January 2020 paper Analysis of Breast Cancer Dataset Using Big Data Algorithms for Accuracy of Diseases Prediction. “Diagnosis of diseases like cancer at an early stage is also very crucial.”

The research discussed in the paper focuses on the “prediction model analysis for the breast cancer diagnosis, either benign or malignant, at an early stage as it increases the chances for successful treatment… so predicting breast cancer at benign increases the survival rate of women.”

A May 2019 article in ScienceDaily describes the improvements big data brings to breast cancer research.

Yet another paper, published in November 2019, says that “the prevailing prediction model is time-consuming and (has) less accuracy. To trounce those drawbacks, this paper proposes a breast cancer prediction system (BCPS) utilizing Optimized Artificial Neural Network (OANN).”
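To give a sense of what a benign-or-malignant prediction model involves, here is a minimal Python sketch of a logistic regression classifier trained by gradient descent. It is not the method used in any of the papers cited above. The two features (a radius and a texture measurement), the function names, and the eight training rows are all invented for illustration and are not patient data.

```python
import math

# Synthetic toy rows (NOT real patient data): (radius, texture) -> label,
# where 0 = benign and 1 = malignant. All values are invented for illustration.
SAMPLES = [
    ((11.0, 15.0), 0), ((12.0, 14.0), 0), ((12.5, 17.0), 0), ((10.5, 16.0), 0),
    ((18.0, 22.0), 1), ((20.0, 25.0), 1), ((17.5, 21.0), 1), ((19.0, 24.0), 1),
]


def fit_scaler(rows):
    """Return per-feature (mean, std) so features can be standardized."""
    stats = []
    for column in zip(*rows):
        mean = sum(column) / len(column)
        var = sum((v - mean) ** 2 for v in column) / len(column)
        stats.append((mean, math.sqrt(var)))
    return stats


def scale(row, stats):
    """Standardize one row using the stored means and standard deviations."""
    return [(v - mean) / std for v, (mean, std) in zip(row, stats)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, learning_rate=0.5, epochs=500):
    """Fit logistic regression weights with batch gradient descent."""
    stats = fit_scaler([features for features, _ in samples])
    data = [(scale(features, stats), label) for features, label in samples]
    weights, bias = [0.0] * len(stats), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in data:
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return stats, weights, bias


def predict_proba(features, model):
    """Estimated probability that the sample is malignant."""
    stats, weights, bias = model
    x = scale(features, stats)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def predict(features, model):
    """Return 1 (malignant) if the estimated probability is at least 0.5."""
    return 1 if predict_proba(features, model) >= 0.5 else 0


model = train(SAMPLES)
print(predict((11.5, 15.5), model))  # a sample resembling the benign rows
print(predict((19.5, 23.0), model))  # a sample resembling the malignant rows
```

A real diagnostic model would be trained on actual clinical data, checked against held-out cases, and judged on measures such as sensitivity and specificity rather than on a handful of toy rows.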

If you’re eager to understand what this means and how you can get involved in the exciting field of predictive analysis to fight breast cancer and other diseases, the world needs you.

Earn a Graduate Certificate in Data Analytics

As data mining in healthcare and predictive analysis become ever more potent tools in the fight against breast cancer, the demand for trained professionals with expertise in data analytics continues to grow.

The online Graduate Certificate in Data Analytics is designed for people who want to “make an impact in a data-driven world.” What better way to make that impact than to help reduce breast cancer mortality?

The program entails a curriculum of four courses. Each course builds on the others, giving students the skills and a comprehensive understanding of advanced predictive analysis. Courses and topics include:

Data Analysis

An introduction to the concepts and tools commonly used in statistical analysis.


Focus on the application of statistical tools for estimating economic and other relationships. Understand statistical models used in predictive analysis, including the linear regression model.

Big Data Econometrics

Learn the terminology, technology, and techniques that drive machine learning. Understand the applications of big data and machine learning in healthcare, economics, and many other areas.

Predictive Analytics/Forecasting

Examine the most popular forecasting methods in the industry. Learn time series data manipulation and feature creation. Work with transactional and hierarchical time series data. Consider procedures for evaluating forecasting models.
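As a small taste of the techniques named in these course descriptions, the Python sketch below fits a linear regression trend to a short time series, forecasts the next value, and builds two common time series features (lagged values and a rolling mean). It is not taken from the course materials. The function names and the monthly counts are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return mean_y - slope * mean_x, slope


def lag_features(series, n_lags):
    """Pair each value with the n_lags values that precede it."""
    return [
        (series[t - n_lags:t], series[t]) for t in range(n_lags, len(series))
    ]


def rolling_mean(series, window):
    """Mean of each consecutive run of `window` values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical monthly counts, invented for illustration only.
monthly_counts = [100, 104, 108, 112, 116, 120]
months = list(range(len(monthly_counts)))

intercept, slope = fit_line(months, monthly_counts)
forecast = intercept + slope * len(monthly_counts)  # next month on the trend line

print(round(forecast, 1))
print(lag_features(monthly_counts, 2)[0])
print(rolling_mean(monthly_counts, 3))
```

Real series are rarely this tidy, which is why evaluating a forecasting model on data it has not seen matters as much as fitting it.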


Students with a bachelor’s degree can enroll in the program. The Graduate Certificate in Data Analytics is built for recent grads ready to establish a new career or experienced professionals ready to take on new roles.

“Breast cancer lends itself particularly well to big data because the disease takes on so many forms,” says one article. “If we can bring patients together digitally and look at effective treatment options, we can do a lot to help them.” Fulfilling the promise that Big Data offers requires skilled and dedicated data analysts.

With a graduate certificate in data analytics from Boston College, you can join the ranks of those bringing that promise to fruition.