INTRODUCTION
Obstructive sleep apnea (OSA) is a condition in which breathing repeatedly stops during sleep. If left untreated, it can significantly reduce quality of life by causing chronic fatigue, daytime sleepiness, and cognitive decline [1]. The prevalence of OSA in Korea is 4% in men and 2% in women, but it is known to rise to 30% in age-specific studies [2]. OSA is a potentially fatal disease because it can increase the risk of cardiovascular disease, such as coronary artery disease and stroke, and of diabetes [3-5]. Considering the prevalence and health effects of OSA, proper diagnosis and treatment are essential. Currently, polysomnography (PSG) is the gold standard for diagnosing OSA. However, PSG is poorly suited to diagnosing or following up large numbers of OSA patients because it requires in-hospital facilities and considerable time and labor for examination and scoring. The home sleep apnea test (HSAT), which complements PSG, is less expensive, but it also requires evaluation by a physician [6] and remains out of reach for many patients because of cost or time constraints.
Recently, digital wearable devices such as smartwatches (SWs) and bands that can easily measure various health-related bio-signals have become common. They are equipped with a 3-axis accelerometer, which distinguishes motion from rest, and a photoplethysmography (PPG) sensor, which measures peripheral blood volume by detecting differences in light absorption. PPG can indirectly estimate oxygen saturation and respiration rate from the way light absorption varies with the heartbeat and with oxygenated hemoglobin levels, and in various medical fields it removes the need to insert a needle into a blood vessel to measure oxygen saturation.
Although most digital wearable devices are not approved by the Food and Drug Administration and are made for well-being or fitness purposes, they measure critical vital signs such as oxygen saturation, pulse rate, and respiration that cross the boundary between medical and non-medical use, which increases the need to evaluate and validate their accuracy. If these devices become more accurate, the paradigm may shift from doctor- and hospital-centered healthcare to patient-centered smart healthcare. Digital healthcare can provide prevention, diagnosis, treatment, and follow-up care anytime and anywhere [7].
Various sleep-related digital wearable devices and apps have been developed to analyze sleep patterns and quality. Early wearable devices analyzed sleep duration and structure based on a 3-axis accelerometer and pulse rate variability. However, they were insufficient to replace PSG, the gold standard, because their specificity was low relative to their sensitivity for every measure except total sleep time [8-10]. Recently introduced SWs can serially measure oxygen saturation during sleep using reflective PPG, so it is theoretically possible to detect the hypoxia that often accompanies sleep-related breathing disorders. Although OSA syndrome (OSAS) is diagnosed by the number of apnea or hypopnea episodes per hour rather than by the severity of oxygen desaturation, a few studies have suggested that oxygen saturation-related indices in patients with OSA are significantly associated with the apnea-hypopnea index (AHI), and that oxygen-related parameters can be helpful in diagnosing OSAS and assessing its severity [11-13].
Therefore, the purpose of this study was to compare the oxygen saturation parameters of PPG-based SWs with the results of PSG to determine their diagnostic accuracy and how useful they are for OSA screening.
METHODS
Subjects
A prospective study was conducted in patients aged 19 years or older who visited the Eulji University Hospital Sleep Clinic from September 1, 2021 to September 30, 2022, underwent PSG for suspected OSA, and voluntarily agreed to participate. The exclusion criteria were as follows: age younger than 19 years, no consent to participate, pigmented or tattooed wrists, inability to communicate clearly or respond to pain, and sleep time of less than 240 minutes.
All participants provided written informed consent, and the study protocol was approved by the Institutional Review Board of Daejeon Eulji University Hospital (IRB no. 2022-03-010-002).
PPG-based SW
PPG is a non-invasive method for measuring blood volume in the microvascular layer of the skin based on the optical properties of human tissue, such as absorption and reflection, at specific light wavelengths. Deoxygenated hemoglobin absorbs more red light, and oxygenated hemoglobin absorbs more infrared light [14].
The SWs measure infrared and red PPG signals from the wrist using reflective pulse oximetry to obtain the perfusion index, which is the ratio of pulsatile to non-pulsatile (static) blood flow, at each wavelength. Oxygen saturation is then estimated from the R value, which is calculated as the ratio of the two perfusion indices, using a pretrained model.
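The R value described above can be illustrated with a minimal sketch. The function names are hypothetical, and the linear mapping `110 - 25 * R` is only a commonly cited textbook approximation used here as a stand-in; the pretrained models inside the GW and AW are not described in this paper and are likely to differ.

```python
from statistics import mean


def perfusion_index(ppg):
    """Ratio of the pulsatile (AC) to the non-pulsatile (DC) PPG component."""
    ac = max(ppg) - min(ppg)  # peak-to-peak pulsatile amplitude
    dc = mean(ppg)            # static baseline level
    return ac / dc


def r_value(red_ppg, ir_ppg):
    """Ratio of the red perfusion index to the infrared perfusion index."""
    return perfusion_index(red_ppg) / perfusion_index(ir_ppg)


def spo2_linear(r, a=110.0, b=25.0):
    """Assumed empirical calibration, NOT the watches' pretrained model."""
    return min(100.0, max(0.0, a - b * r))


# Hypothetical samples covering one pulse at each wavelength
red = [0.99, 1.00, 1.01, 1.00]
ir = [0.98, 1.00, 1.02, 1.00]
estimate = spo2_linear(r_value(red, ir))
```

In practice the AC and DC components would be extracted from filtered signals over many pulses, and the calibration would be fitted against a reference oximeter.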
An Android-based Samsung Galaxy watch4 (GW, 44 mm) and an iPhone-based Apple watch7 (AW, 44 mm) were used in the study. Participants were randomly assigned to wear GW or AW on both wrists, and a transmissive pulse oximeter was worn on a fingertip during overnight PSG at the sleep clinic.
PSG and scoring rule
PSG was conducted in a level I environment according to the guidelines of the American Academy of Sleep Medicine (AASM) using Embla N7000 (Natus, Kanata, Canada). Electroencephalogram (F3, F4, C3, C4, O1, O2), electrooculogram, chin and leg electromyogram, airflow signals, respiratory effort signals, electrocardiogram, snoring, and peripheral oxygen saturation were measured. Scoring was performed by a sleep medicine specialist according to the AASM Manual for the Scoring of Sleep and Associated Events. An AHI of less than 5 per hour was defined as normal, 5 to less than 15 as mild, 15 to less than 30 as moderate, and 30 or more as severe.
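The severity grading above can be written as a short sketch. The function names are hypothetical; the AHI is computed here as apneas plus hypopneas per hour of sleep, and the thresholds are the ones stated in the text.

```python
def ahi(apneas, hypopneas, total_sleep_minutes):
    """Apnea-hypopnea index: respiratory events per hour of sleep."""
    return (apneas + hypopneas) / (total_sleep_minutes / 60.0)


def osa_severity(ahi_per_hour):
    """Grade OSA severity using the cut-offs given in the Methods."""
    if ahi_per_hour < 5:
        return "normal"
    if ahi_per_hour < 15:
        return "mild"
    if ahi_per_hour < 30:
        return "moderate"
    return "severe"


# Hypothetical night: 30 apneas and 30 hypopneas over 480 minutes of sleep
example_ahi = ahi(30, 30, 480)
example_grade = osa_severity(example_ahi)
```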
Statistics
Statistical analysis was performed using IBM SPSS Statistics 25 (IBM Corp., Armonk, NY, USA). A Bland-Altman plot and the intraclass correlation coefficient (ICC) were used to evaluate the agreement between SWs and PSG. To compare the diagnostic performance of the two SWs for OSA screening, receiver-operating characteristic (ROC) curve analysis was performed and cut-off values were obtained. Statistical significance was set at p<0.05.
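As an illustration of the agreement analysis, the quantities behind a Bland-Altman plot can be sketched as follows. The analysis in this study was run in SPSS; the function name and the sample readings below are hypothetical, and the limits of agreement use the conventional mean difference ± 1.96 standard deviations.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) of agreement for paired readings."""
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)       # mean difference, device minus reference
    spread = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical paired oxygen saturation readings (%), not study data
watch = [96, 95, 97, 94]
psg = [95, 94, 95, 93]
bias, lower, upper = bland_altman(watch, psg)
```

A positive bias in this convention means the device reads higher than the reference oximeter.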
DISCUSSION
In this study, SWs, one of the most user-friendly and popular types of wearable device, were compared with PSG, the diagnostic gold standard for OSA, and their diagnostic performance and potential were confirmed. For GW, a cut-off of more than 7 seconds with oxygen saturation below 90%, or a lowest oxygen saturation below 88.0%, gave approximately 78%–80% accuracy for estimating OSA. AW showed approximately 68%–73% accuracy when OSA was diagnosed with an average oxygen saturation of 95.7% or less or a lowest oxygen saturation of 93.0% or less.
Both GW and AW showed a significant degree of accuracy in diagnosing OSA from oxygen parameters, but the GW indices were more intuitive and accurate. With GW, an oxygen saturation below 90%, whether judged by the time spent below 90% or by the lowest value, makes it easy to see that there is a breathing problem during sleep. With AW, however, even the cut-off value for the lowest oxygen saturation is above 90%, so it is not easy for users to recognize sleep-disordered breathing from their oxygen saturation values. In addition, AW tended to overestimate oxygen saturation by an average of 2%–10% compared with the PSG transmissive pulse oximeter, so there is a risk that the general public may overestimate their actual oxygen saturation during sleep.
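How a cut-off value can be read off a ROC analysis is sketched below. This paper does not state which criterion was used to select its cut-offs; maximizing Youden's J (sensitivity + specificity - 1) is assumed here as one common choice, and the function name and data are hypothetical.

```python
def youden_cutoff(values, has_osa):
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    A night counts as positive when its value is <= cutoff, which suits
    indices such as lowest oxygen saturation where lower means worse.
    """
    positives = sum(1 for y in has_osa if y)
    negatives = len(has_osa) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one OSA and one non-OSA subject")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, has_osa) if y and v <= cutoff)
        tn = sum(1 for v, y in zip(values, has_osa) if not y and v > cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:  # ties keep the lower cutoff
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical lowest oxygen saturation values (%), not study data
lowest_spo2 = [85, 87, 88, 90, 92, 94, 95, 96]
osa = [True, True, True, False, True, False, False, False]
cutoff, sensitivity, specificity = youden_cutoff(lowest_spo2, osa)
```

For an index where higher values are worse, such as time spent below 90%, the comparison direction would be reversed.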
The second significance of this study is that, unlike other studies that reanalyzed data extracted from SWs [15], the oxygen saturation data displayed on the SW were directly compared with those of PSG. Until now, wearable devices have not shown accuracy comparable to PSG because of various sources of noise or limitations of the devices themselves [8-10]. To overcome this, researchers had to extract raw data from the SW and reanalyze them in 30-second epochs to reduce noise and errors. However, this reanalysis requires time and expert help, so it is impractical and not readily available to the general public. In that regard, comparing the real-world data of SWs with PSG makes the results of this study easy to communicate to the general public.
OSA is also becoming a chronic disease that needs to be checked frequently as the population ages, in order to reduce the risk of OSA complications such as hypertension, ischemic heart disease, type 2 diabetes, and stroke. However, PSG, known as the gold standard for diagnosing OSA, is not an appropriate tool for managing the increasing number of OSA patients in terms of cost, time, and efficiency. To supplement this, the AASM announced that the HSAT could be used under physician review for patients with suspected OSA [6], although it also states that ‘the HSAT should not be used for the general screening of asymptomatic populations’ [6]. However, some OSA patients are asymptomatic, some are unaware of the severity of their symptoms, and many still find it difficult to see a doctor. Furthermore, PSG is too expensive and cumbersome for tracking whether a patient is receiving the right treatment.
Level III HSAT lacks electroencephalography but can measure airflow, respiratory effort, and blood oxygenation. Recent SWs have built-in actigraphy and PPG-based oxygen saturation sensors, which can indirectly estimate airflow and breathing rate and intensity, and they are evolving to a level comparable to level IV HSAT [15-17].
Although AHI is the only criterion for diagnosing OSA and assessing its severity, oxygen saturation can also be a useful parameter, as hypoxia is known to impair physiological and cognitive function [13]. Supporting studies have reported a correlation between the oxygen desaturation index (ODI) and AHI, or that oxygen desaturation may better reflect the severity of OSA symptoms [11,12]. In a study of 178 subjects using PSG and questionnaires, OSA severity based on AHI classification was correlated with various oxygen desaturation indices, and total and respiratory arousal indices were directly correlated with OSA severity [11]. A study using PSG and the multiple sleep latency test in 362 patients with suspected OSA reported that a 10% increase in the severity of oxygen desaturation might increase the risk of OSA-related daytime sleepiness significantly more than a 10% increase in AHI [12]. Another study compared ODI obtained with a single transmissive pulse oximeter with PSG and HSAT. The ODI from the single pulse oximeter correlated significantly with the RDI from HSAT and the AHI from PSG, and the area under the ROC curve (AUC) for AHI ≥5 by PSG was reported to be 0.83 (confidence interval, 0.73–0.93), indicating that it could provide similar prediction of OSA [18]. This AUC is similar to the predictive power of GW in our study.
There are two types of pulse oximeter: transmissive and reflectance. The transmissive type has relatively high accuracy because it clips onto a thin area such as a fingertip or earlobe, emitting light from one side and detecting it on the other. In contrast, wrist-worn wearable devices such as SWs use the reflectance type, which emits and detects light on the same side, so noise or errors may occur if the device does not sit closely against the skin of the wrist. Errors may also arise depending on the thickness and color of the wrist skin and the development of subcutaneous blood vessels. The clip-type transmissive pulse oximeter is accurate, but it restricts finger movement, causes discomfort, and may make long-term use difficult. The wrist-type reflectance oximeter, by contrast, is easy to wear and can measure repeatedly or continuously over a long period.
Recently, a study comparing GW with the PSG transmissive pulse oximeter was reported in Korea. From the GW raw data, ODI was derived as the number of oxygen desaturation episodes divided by the total sleep time. The sensitivity, specificity, and accuracy of a GW ODI ≥5/h for OSA diagnosis were 89.7%, 64.1%, and 79.4%, respectively, similar to our results [15]. That study also confirmed that the ODI of GW had a strong positive correlation (r=0.918) with the AHI of PSG [15]. Our study differs in that it directly compared the parameters displayed on the SW with PSG rather than recalculated parameters.
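The ODI definition used in that study (desaturation episodes divided by total sleep time) and its ≥5/h criterion can be sketched as follows. The function names and the example night are hypothetical, and how individual desaturation episodes were detected from the raw data is not described here, so the episode count is taken as an input.

```python
def odi(desaturation_episodes, total_sleep_minutes):
    """Oxygen desaturation index: desaturation episodes per hour of sleep."""
    return desaturation_episodes / (total_sleep_minutes / 60.0)


def screens_positive(odi_per_hour, cutoff=5.0):
    """Apply the ODI >= 5/h criterion cited from the GW raw-data study."""
    return odi_per_hour >= cutoff


# Hypothetical night: 42 desaturation episodes over 420 minutes of sleep
night_odi = odi(42, 420)
flagged = screens_positive(night_odi)
```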
A similar study comparing OSA diagnosis by a Chinese SW with PSG or HSAT has been reported in China. The study included 119 patients; when the SW was compared with PSG, the accuracy, sensitivity, and specificity of OSA prediction (AHI ≥5/h) were 81.1%, 76.5%, and 100%, respectively. When the SW was compared with HSAT, the accuracy, sensitivity, and specificity of moderate OSA prediction (AHI ≥15/h) were 87.9%, 89.7%, and 86.0%, respectively [16].
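For reference, the three measures quoted in these comparisons follow from a 2×2 table of device result against the reference test. The sketch below uses hypothetical counts, not the counts of any cited study, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # OSA patients correctly flagged
    specificity = tn / (tn + fp)  # non-OSA subjects correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return accuracy, sensitivity, specificity


# Hypothetical counts: 8 true positives, 1 false positive,
# 9 true negatives, 2 false negatives
accuracy, sensitivity, specificity = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
```

Because accuracy depends on how many subjects in the sample actually have OSA, it shifts with prevalence even when sensitivity and specificity stay fixed.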
In our study, the diagnostic accuracy of GW using the time with oxygen saturation below 90% was 80.7% for mild, 69.7% for moderate, and 63.3% for severe OSA. As OSA severity increased, the difference between the saturation values measured by PSG and by GW grew and the diagnostic accuracy decreased, showing a different pattern from the study in China. In the Chinese study, an indirect AHI was derived by estimating the PPG-based respiration rate, as well as the oxygen saturation, with its own algorithm. However, that study did not describe in detail how the respiratory rate was calculated or how apnea and hypopnea were distinguished.
Because this study was conducted at a sleep clinic that mainly examines OSA, the prevalence of OSA was very high, so it is difficult to say that the results reflect the real world. Further study is needed to determine whether the false-positive rate of SW-based OSA diagnosis increases when a large number of normal subjects are included, and whether this error can be overcome by repeating SW measurements over several nights.
In this study, we were able to assess the accuracy of OSA diagnosis using SWs that anyone can easily use. Because digital health wearable devices such as SWs can easily measure and manage health-related bio-signals, they are expected to contribute to the diagnosis and evaluation of chronic sleep disorders such as OSA.