2020 Aug 13;2020(8):CD005552. doi: 10.1002/14651858.CD005552.pub3

Summary of findings 4. Metformin compared to oral contraceptive pill (OCP) for hirsutism, acne, and menstrual pattern in adolescent women with polycystic ovary syndrome (PCOS).

Patient or population: adolescent women with PCOS
Setting: hospital or university clinics
Intervention: metformin
Comparison: OCP
Outcomes		Anticipated absolute effects* (95% CI): risk with OCP	Anticipated absolute effects* (95% CI): risk with metformin	Relative effect (95% CI)	№ of participants (studies)	Quality of the evidence (GRADE)	Comments
Hirsutism ‐ Clinical F‐G score		The mean hirsutism ‐ Clinical F‐G score was 8.6	MD 0.40 lower (3.42 lower to 2.62 higher)	‐	16 (1 RCT)	⊕⊝⊝⊝ VERY LOW ¹,²
Adverse event ‐ Severe	Gastro‐intestinal	No trials reported on outcome "Adverse event ‐ Severe ‐ Gastro‐intestinal"
Adverse event ‐ Severe	Others	150 per 1 000	100 per 1 000 (27 to 300)	OR 0.63 (0.16 to 2.43)	80 (1 RCT)	⊕⊝⊝⊝ VERY LOW ³,⁴
Adverse event ‐ Minor	Gastro‐intestinal	0 per 1 000	3 per 1 000 (0 to 0)	OR 11.67 (0.53 to 258.56)	22 (1 RCT)	⊕⊝⊝⊝ VERY LOW ¹,⁵	There were only 3 events in the metformin arm and 0 in the OCP arm
Adverse event ‐ Minor	Others	No trials reported on outcome "Adverse event ‐ Minor ‐ Others"
Improved menstrual pattern	Shortening of inter menstrual days	No trials reported on outcome "Improved menstrual pattern (i.e. shortening of inter menstrual days)"
Improved menstrual pattern	An initiation of menses or cycle regularity	1 000 per 1 000	1 000 per 1 000 (1 000 to 1 000)	OR 0.10 (0.01 to 1.92)	80 (1 RCT)	⊕⊝⊝⊝ VERY LOW ³,⁶	40 of 40 participants had an improved menstrual pattern in the OCP group compared with 36 of 40 in the metformin group
Acne ‐ Visual analogue scale or Clinical acne score		No trials reported either on outcome "Acne ‐ Visual analogue scale" or "Acne ‐ Clinical acne score"
BMI (kg/m²)		The mean BMI (kg/m²) was 36	MD 1.45 lower (5.08 lower to 2.17 higher)	‐	69 (3 RCTs)	⊕⊝⊝⊝ VERY LOW ⁷,⁸
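The odds ratio in the "Improved menstrual pattern" row can be approximately reproduced from the counts given in its comment (36 of 40 with metformin, 40 of 40 with OCP). The sketch below is illustrative only. It assumes a standard log odds ratio with a Wald 95% interval and a 0.5 continuity correction added to every cell when any cell is zero; the table does not state which method the review authors used, and the function name `odds_ratio_with_ci` is hypothetical.

```python
import math


def odds_ratio_with_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of arm A versus arm B with a Wald confidence interval.

    Assumption (not stated in the table): when any cell of the 2x2 table
    is zero, 0.5 is added to all four cells before computing the estimate.
    """
    a, b = events_a, total_a - events_a  # arm A: events, non-events
    c, d = events_b, total_b - events_b  # arm B: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    point = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(point) - z * se)
    high = math.exp(math.log(point) + z * se)
    return point, low, high


# Metformin (arm A): 36 of 40 improved; OCP (arm B): 40 of 40 improved.
point, low, high = odds_ratio_with_ci(36, 40, 40, 40)
print(f"OR {point:.2f} ({low:.2f} to {high:.2f})")  # OR 0.10 (0.01 to 1.92)
```

Under these assumptions the result agrees with the tabulated OR 0.10 (0.01 to 1.92). The wide interval comes from the empty "not improved" cell in the OCP arm.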
*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
BMI: body mass index; CI: confidence interval; F‐G: Ferriman‐Gallwey score; MD: mean difference; OR: odds ratio; RCT: randomised controlled trial.
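The conversion described in the footnote can be sketched as follows: the assumed comparator risk is turned into odds, multiplied by the odds ratio, and turned back into a risk. This is a minimal illustration and not the review authors' software; the function names `risk_from_odds_ratio` and `anticipated_effect` are hypothetical.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Intervention risk implied by a comparator risk and an odds ratio."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must lie between 0 and 1")
    if odds_ratio < 0:
        raise ValueError("odds_ratio must be non-negative")
    if baseline_risk == 1.0:
        # Comparator odds are infinite, so the implied risk stays at 1.
        return 1.0
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    intervention_odds = odds_ratio * baseline_odds
    return intervention_odds / (1.0 + intervention_odds)


def anticipated_effect(baseline_per_1000, or_point, or_low, or_high):
    """Return (point, lower, upper) intervention risks per 1 000 people."""
    baseline = baseline_per_1000 / 1000
    return tuple(
        round(1000 * risk_from_odds_ratio(baseline, odds_ratio))
        for odds_ratio in (or_point, or_low, or_high)
    )


# "Adverse event - Severe - Others": 150 per 1 000 with OCP, OR 0.63 (0.16 to 2.43)
print(anticipated_effect(150, 0.63, 0.16, 2.43))  # (100, 27, 300)
```

This reproduces the tabulated 100 per 1 000 (27 to 300). When the comparator risk is 1 000 per 1 000, as in the menstrual pattern row, the implied risk stays at 1 000 per 1 000 for any odds ratio, so the absolute effect carries no information there.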
GRADE Working Group grades of evidence
High quality: further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: we are very uncertain about the estimate.

¹ Evidence downgraded by one level for serious risk of bias – a single RCT which has unclear risk of bias.
² Evidence downgraded by two levels for very serious imprecision – very low number of participants (total number of participants < 400, i.e. n = 16 participants) and 95% CI includes both appreciable benefit and appreciable harm.
³ Evidence downgraded by one level for serious risk of bias – a single RCT which has high risk of bias.
⁴ Evidence downgraded by two levels for very serious imprecision – very low number of events (total number of events < 300, i.e. n = 10 events) and 95% CI includes both appreciable benefit and appreciable harm.
⁵ Evidence downgraded by two levels for very serious imprecision – very low number of events (total number of events < 300, i.e. n = 3 events) and 95% CI includes both appreciable benefit and appreciable harm.
⁶ Evidence downgraded by two levels for very serious imprecision – very low number of events (total number of events < 300, i.e. n = 76 events) and 95% CI includes both appreciable benefit and appreciable harm.
⁷ Evidence downgraded by one level for serious risk of bias – the majority of the RCTs have unclear risk of bias.
⁸ Evidence downgraded by two levels for very serious imprecision – very low number of participants (total number of participants < 400, i.e. n = 69 participants) and 95% CI includes both appreciable benefit and appreciable harm.
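The way these footnotes combine into a rating can be sketched as simple bookkeeping. The example assumes the usual GRADE convention that evidence from RCTs starts at high quality and drops one level per downgrade, with very low as the floor; the function name `grade_rating` and the dictionary layout are hypothetical and are not taken from the review.

```python
# Ordered from lowest to highest certainty.
GRADE_LEVELS = ["VERY LOW", "LOW", "MODERATE", "HIGH"]


def grade_rating(downgrades, start="HIGH"):
    """Apply downgrade levels to a starting rating (assumed HIGH for RCTs)."""
    index = GRADE_LEVELS.index(start) - sum(downgrades.values())
    # The rating cannot fall below the lowest level.
    return GRADE_LEVELS[max(index, 0)]


# Hirsutism row, footnotes 1 and 2: one level for risk of bias,
# two levels for very serious imprecision.
hirsutism = {"risk of bias": 1, "imprecision": 2}
print(grade_rating(hirsutism))  # VERY LOW
```

Every rated outcome in the table carries the same pattern of one level for risk of bias plus two for imprecision, which is why all of them end at very low.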