
Table 1 The dataset comprises data from 5046 female participants of Mexican descent. It includes demographic details such as age, and medical history covering breast implant status, prior cancer history, and previous surgeries or biopsies. It also contains family cancer history, breast density classifications (A, B, C, D), and BI-RADS assessments across five categories. Finally, it records biopsy recommendations, confirmed cancer cases, and the mean time to a cancer event.

From: TECRR: a benchmark dataset of radiological reports for BI-RADS classification with machine learning, deep learning, and large language model baselines

| Characteristic | Value |
| --- | --- |
| No. of Examples | 5046 |
| Sex | Women: 5046 (100%) |
| Race | Mexican Population |
| Age (Mean, SD, Range) | 53, 9.99, 25-90 |
| Patients with Implants | No: 4185, Yes: 861 |
| History of Previous Cancer | No: 4525, Yes: 521 |
| Previous Surgeries/Biopsies | No: 4160, Yes: 886 |
| Family History of Cancer | No: 4296, Yes: 750 |
| Breast Density Distribution | A: 641, B: 1271, C: 2053, D: 1064 |
| BI-RADS Distribution | 1: 117, 2: 3921, 3: 802, 4: 129, 5: 77 |
| Biopsy Recommendation | No: 4796, Yes: 250 |
| Patients with Confirmed Cancer | 30 |
| Cancer Development in Five Years | 61 |
| Mean Time to Cancer Event | Mean: 1000.008 Days, SD: 898.91 |
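
The counts in the table can be cross-checked against the stated total of 5046 examples. The following is a minimal sketch of such a check, not part of the original article: the dictionary keys and function names are hypothetical, and the numbers are copied directly from the table. Under this check, every categorical row sums to 5046 except breast density, which sums to 5029. The table itself does not explain the 17-case difference. The same sketch also computes the share of each BI-RADS category, which makes the class imbalance visible (category 2 accounts for roughly 78% of the examples).

```python
# Counts copied from Table 1. All names below are illustrative, not from the article.
TOTAL = 5046

counts = {
    "implants": {"No": 4185, "Yes": 861},
    "previous_cancer": {"No": 4525, "Yes": 521},
    "previous_surgeries_biopsies": {"No": 4160, "Yes": 886},
    "family_history": {"No": 4296, "Yes": 750},
    "breast_density": {"A": 641, "B": 1271, "C": 2053, "D": 1064},
    "birads": {"1": 117, "2": 3921, "3": 802, "4": 129, "5": 77},
    "biopsy_recommendation": {"No": 4796, "Yes": 250},
}


def missing_from_total(categories, total=TOTAL):
    """Return how many examples are unaccounted for in one table row."""
    return total - sum(categories.values())


def proportions(categories):
    """Return each category's share of the counts recorded in that row."""
    n = sum(categories.values())
    return {label: value / n for label, value in categories.items()}


# Gap between the stated total and the sum of each row (0 means consistent).
gaps = {name: missing_from_total(row) for name, row in counts.items()}

# Share of each BI-RADS category, useful for judging class imbalance.
birads_share = proportions(counts["birads"])

if __name__ == "__main__":
    for name, gap in gaps.items():
        status = "ok" if gap == 0 else f"{gap} unaccounted for"
        print(f"{name}: {status}")
    for label, share in birads_share.items():
        print(f"BI-RADS {label}: {share:.1%}")
```

A row with a non-zero gap is flagged rather than corrected, since the table alone gives no basis for deciding where the remaining cases belong.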