Convex Optimization - Quick Guide

Questo corso è utile per gli studenti che vogliono risolvere problemi di ottimizzazione non lineare che si presentano in varie applicazioni ingegneristiche e scientifiche. Questo corso inizia con la teoria di base della programmazione lineare e introdurrà i concetti di insiemi convessi e funzioni e le relative terminologie per spiegare i vari teoremi necessari per risolvere i problemi di programmazione non lineare. Questo corso introdurrà vari algoritmi utilizzati per risolvere tali problemi. Questo tipo di problemi sorgono in varie applicazioni tra cui l'apprendimento automatico, i problemi di ottimizzazione nell'ingegneria elettrica, ecc. Richiede che gli studenti abbiano una conoscenza preliminare dei concetti di matematica e del calcolo delle scuole superiori.

In questo corso, gli studenti impareranno a risolvere i problemi di ottimizzazione come $ min f \ left (x \ right) $ soggetti ad alcuni vincoli.

Questi problemi sono facilmente risolvibili se la funzione $ f \ left (x \ right) $ è una funzione lineare e se i vincoli sono lineari. Quindi si chiama problema di programmazione lineare (LPP). Ma se i vincoli non sono lineari, è difficile risolvere il problema di cui sopra. A meno che non possiamo tracciare le funzioni in un grafico, provare ad analizzare l'ottimizzazione può essere unidirezionale, ma non possiamo tracciare una funzione se è oltre le tre dimensioni. Da qui vengono le tecniche di programmazione non lineare o di programmazione convessa per risolvere tali problemi. In questi tutorial, ci concentreremo sull'apprendimento di tali tecniche e, alla fine, su alcuni algoritmi per risolvere tali problemi. per prima cosa porteremo la nozione di insiemi convessi che è alla base dei problemi di programmazione convessi. Poi con l'introduzione delle funzioni convesse, tratteremo alcuni importanti teoremi per risolvere questi problemi e alcuni algoritmi basati su questi teoremi.

Terminologie

Lo spazio $ \ mathbb {R} ^ n $ - È un vettore n-dimensionale con numeri reali, definito come segue - $ \ mathbb {R} ^ n = \ left \ {\ left (x_1, x_2, ... , x_n \ right) ^ {\ tau}: x_1, x_2, ...., x_n \ in \ mathbb {R} \ right \} $
Lo spazio $ \ mathbb {R} ^ {mXn} $ - È un insieme di tutte le matrici di valori reali dell'ordine $ mXn $.

Metodologia

La programmazione lineare, chiamata anche ottimizzazione lineare, è una tecnica utilizzata per risolvere problemi matematici in cui le relazioni sono di natura lineare. la natura fondamentale della programmazione lineare è massimizzare o minimizzare un fileobjective function con soggetto ad alcuni constraints. La funzione obiettivo è una funzione lineare che si ottiene dal modello matematico del problema. I vincoli sono le condizioni che vengono imposte al modello e sono anche lineari.

Dalla domanda data, trova la funzione obiettivo.
trova i vincoli.
Disegna i vincoli su un grafico.
trova la regione ammissibile, che è formata dall'intersezione di tutti i vincoli.
trova i vertici della regione ammissibile.
trova il valore della funzione obiettivo in questi vertici.
Il vertice che massimizza o minimizza la funzione obiettivo (secondo la domanda) è la risposta.

Esempi

Step 1 - Massimizza $ 5x + 3y $ soggetto a

$ x + y \ leq 2 $,

$ 3x + y \ leq 3 $,

$ x \ geq 0 \: e \: y \ geq 0 $

Solution -

Il primo passo è trovare la regione ammissibile su un grafico.

Chiaramente dal grafo, i vertici della regione ammissibile sono

$ \ left (0, 0 \ right) \ left (0, 2 \ right) \ left (1, 0 \ right) \ left (\ frac {1} {2}, \ frac {3} {2} \ right ) $

Sia $ f \ sinistra (x, y \ destra) = 5x + 3y $

Mettendo questi valori nella funzione obiettivo, otteniamo:

$ f \ sinistra (0, 0 \ destra) $ = 0

$ f \ sinistra (0, 2 \ destra) $ = 6

$ f \ sinistra (1, 0 \ destra) $ = 5

$ f \ left (\ frac {1} {2}, \ frac {3} {2} \ right) $ = 7

Pertanto, la funzione massimizza a $ \ left (\ frac {1} {2}, \ frac {3} {2} \ right) $

Step 2- Una società di orologi produce un orologio digitale e uno meccanico. Le proiezioni a lungo termine indicano una domanda prevista di almeno 100 orologi digitali e 80 orologi meccanici ogni giorno. A causa delle limitazioni alla capacità di produzione, è possibile realizzare giornalmente non più di 200 orologi digitali e 170 meccanici. Per soddisfare un contratto di spedizione, ogni giorno viene spedito un totale di almeno 200 orologi.

Se ogni orologio digitale venduto si traduce in una perdita di $ \ $ 2 $, ma ogni orologio meccanico produce un profitto di $ \ $ 5 $, quanti di ciascun tipo dovrebbero essere effettuati ogni giorno per massimizzare i profitti netti?

Solution -

Sia $ x $ il numero di orologi digitali prodotti

$ y $ essere il numero di orologi meccanici prodotti

Secondo la domanda, ogni giorno devono essere realizzati almeno 100 orologi digitali e possono essere realizzati al massimo 200 orologi digitali.

$ \ Rightarrow 100 \ leq \: x \ leq 200 $

Allo stesso modo, ogni giorno devono essere realizzati almeno 80 orologi meccanici e possono essere realizzati al massimo 170 orologi meccanici.

$ \ Rightarrow 80 \ leq \: y \ leq 170 $

Poiché ogni giorno devono essere prodotti almeno 200 orologi.

$ \ Freccia destra x + y \ leq 200 $

Poiché ogni orologio digitale venduto comporta una perdita di $ \ $ 2 $, ma ogni orologio meccanico produce un profitto di $ \ $ 5 $,

Il profitto totale può essere calcolato come

$ Profitto = -2x + 5y $

E dobbiamo massimizzare il profitto, Pertanto, la domanda può essere formulata come:

Massimizza $ -2x + 5y $ soggetto a

$ 100 \: \ leq x \: \ leq 200 $

$ 80 \: \ leq y \: \ leq 170 $

$ x + y \: \ leq 200 $

Tracciando le equazioni di cui sopra in un grafico, otteniamo,

I vertici della regione ammissibile sono

$ \ sinistra (100, 170 \ destra) \ sinistra (200, 170 \ destra) \ sinistra (200, 180 \ destra) \ sinistra (120, 80 \ destra) e \ sinistra (100, 100 \ destra) $

Il valore massimo della funzione obiettivo si ottiene a $ \ sinistra (100, 170 \ destra) $ Pertanto, per massimizzare i profitti netti, è necessario produrre 100 unità di orologi digitali e 170 unità di orologi meccanici.

Una norma è una funzione che fornisce un valore strettamente positivo a un vettore o una variabile.

Norm è una funzione $ f: \ mathbb {R} ^ n \ rightarrow \ mathbb {R} $

Le caratteristiche di base di una norma sono:

Sia $ X $ un vettore tale che $ X \ in \ mathbb {R} ^ n $

$ \ sinistra \ | x \ right \ | \ geq 0 $
$ \ sinistra \ | x \ right \ | = 0 \ Leftrightarrow x = 0 \ forall x \ in X $
$ \ sinistra \ | \ alpha x \ destra \ | = \ sinistra | \ alpha \ destra | \ sinistra \ | x \ right \ | \ forall \: x \ in X e \: \ alpha \: è \: a \: scalare $
$ \ sinistra \ | x + y \ destra \ | \ leq \ sinistra \ | x \ destra \ | + \ sinistra \ | y \ destra \ | \ forall x, y \ in X $
$ \ sinistra \ | xy \ destra \ | \ geq \ sinistra \ | \ sinistra \ | x \ destra \ | - \ sinistra \ | y \ destra \ | \ right \ | $

Per definizione, la norma è calcolata come segue:

$ \ sinistra \ | x \ right \ | _1 = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ left | x_i \ right | $
$ \ sinistra \ | x \ right \ | _2 = \ left (\ displaystyle \ sum \ limits_ {i = 1} ^ n \ left | x_i \ right | ^ 2 \ right) ^ {\ frac {1} {2}} $
$ \ sinistra \ | x \ right \ | _p = \ left (\ displaystyle \ sum \ limits_ {i = 1} ^ n \ left | x_i \ right | ^ p \ right) ^ {\ frac {1} {p}}, 1 \ leq p \ leq \ infty $

La norma è una funzione continua.

Prova

Per definizione, se $ x_n \ rightarrow x $ in $ X \ Rightarrow f \ left (x_n \ right) \ rightarrow f \ left (x \ right) $ allora $ f \ left (x \ right) $ è una funzione costante.

Sia $ f \ sinistra (x \ destra) = \ sinistra \ | x \ destra \ | $

Poiché $ x_n \ rightarrow x $ quindi, $ \ left \ | x_n-x \ right \ | \ rightarrow 0 $

Quindi $ \ left | f \ sinistra (x_n \ destra) -f \ sinistra (x \ destra) \ destra | \ leq 0 \ Rightarrow \ sinistra | f \ sinistra (x_n \ destra) -f \ sinistra (x \ destra) \ destra | = 0 \ Freccia destra f \ sinistra (x_n \ destra) \ freccia destra f \ sinistra (x \ destra) $

Quindi, la norma è una funzione continua.

Il prodotto interno è una funzione che fornisce uno scalare a una coppia di vettori.

Prodotto interno - $ f: \ mathbb {R} ^ n \ times \ mathbb {R} ^ n \ rightarrow \ kappa $ dove $ \ kappa $ è uno scalare.

Le caratteristiche di base del prodotto interno sono le seguenti:

Sia $ X \ in \ mathbb {R} ^ n $

$ \ left \ langle x, x \ right \ rangle \ geq 0, \ forall x \ in X $
$ \ left \ langle x, x \ right \ rangle = 0 \ Leftrightarrow x = 0, \ forall x \ in X $
$ \ left \ langle \ alpha x, y \ right \ rangle = \ alpha \ left \ langle x, y \ right \ rangle, \ forall \ alpha \ in \ kappa \: e \: \ forall x, y \ in X $
$ \ left \ langle x + y, z \ right \ rangle = \ left \ langle x, z \ right \ rangle + \ left \ langle y, z \ right \ rangle, \ forall x, y, z \ in X $
$ \ left \ langle \ overline {y, x} \ right \ rangle = \ left (x, y \ right), \ forall x, y \ in X $

Note -

Relazione tra norma e prodotto interno: $ \ left \ | x \ right \ | = \ sqrt {\ left (x, x \ right)} $
$ \ forall x, y \ in \ mathbb {R} ^ n, \ left \ langle x, y \ right \ rangle = x_1y_1 + x_2y_2 + ... + x_ny_n $

Esempi

1. trova il prodotto interno di $ x = \ left (1,2,1 \ right) \: e \: y = \ left (3, -1,3 \ right) $

Soluzione

$ \ sinistra \ langle x, y \ right \ rangle = x_1y_1 + x_2y_2 + x_3y_3 $

$ \ left \ langle x, y \ right \ rangle = \ left (1 \ times3 \ right) + \ left (2 \ times-1 \ right) + \ left (1 \ times3 \ right) $

$ \ sinistra \ langle x, y \ destra \ rangle = 3 + \ sinistra (-2 \ destra) + 3 $

$ \ sinistra \ langle x, y \ right \ rangle = 4 $

2. Se $ x = \ sinistra (4,9,1 \ destra), y = \ sinistra (-3,5,1 \ destra) $ e $ z = \ sinistra (2,4,1 \ destra) $, trova $ \ sinistra (x + y, z \ destra) $

Soluzione

Come sappiamo, $ \ left \ langle x + y, z \ right \ rangle = \ left \ langle x, z \ right \ rangle + \ left \ langle y, z \ right \ rangle $

$ \ left \ langle x + y, z \ right \ rangle = \ left (x_1z_1 + x_2z_2 + x_3z_3 \ right) + \ left (y_1z_1 + y_2z_2 + y_3z_3 \ right) $

$ \ left \ langle x + y, z \ right \ rangle = \ left \ {\ left (4 \ times 2 \ right) + \ left (9 \ times 4 \ right) + \ left (1 \ times1 \ right) \ right \} + $

$ \ sinistra \ {\ sinistra (-3 \ times2 \ destra) + \ sinistra (5 \ times4 \ destra) + \ sinistra (1 \ volte 1 \ destra) \ destra \} $

$ \ sinistra \ langle x + y, z \ destra \ rangle = \ sinistra (8 + 36 + 1 \ destra) + \ sinistra (-6 + 20 + 1 \ destra) $

$ \ sinistra \ langle x + y, z \ right \ rangle = 45 + 15 $

$ \ sinistra \ langle x + y, z \ right \ rangle = 60 $

Minima locale o Riduci a icona

$ \ bar {x} \ in \: S $ si dice che sono i minimi locali di una funzione $ f $ se $ f \ left (\ bar {x} \ right) \ leq f \ left (x \ right), \ forall x \ in N_ \ varepsilon \ left (\ bar {x} \ right) $ dove $ N_ \ varepsilon \ left (\ bar {x} \ right) $ significa quartiere di $ \ bar {x} $, cioè $ N_ \ varepsilon \ left (\ bar {x} \ right) $ significa $ \ left \ | x- \ bar {x} \ right \ | <\ varepsilon $

Maxima locale o Maximizer

$ \ bar {x} \ in \: S $ si dice che sia il massimo locale di una funzione $ f $ se $ f \ left (\ bar {x} \ right) \ geq f \ left (x \ right), \ forall x \ in N_ \ varepsilon \ left (\ bar {x} \ right) $ dove $ N_ \ varepsilon \ left (\ bar {x} \ right) $ significa quartiere di $ \ bar {x} $, cioè $ N_ \ varepsilon \ left (\ bar {x} \ right) $ significa $ \ left \ | x- \ bar {x} \ right \ | <\ varepsilon $

Minimi globali

$ \ bar {x} \ in \: S $ si dice che siano i minimi globali di una funzione $ f $ se $ f \ left (\ bar {x} \ right) \ leq f \ left (x \ right), \ forall x \ in S $

Massimi globali

$ \ bar {x} \ in \: S $ si dice che sia il massimo globale di una funzione $ f $ se $ f \ left (\ bar {x} \ right) \ geq f \ left (x \ right), \ forall x \ in S $

Esempi

Step 1- trova i minimi e i massimi locali di $ f \ left (\ bar {x} \ right) = \ left | x ^ 2-4 \ destra | $

Solution -

Dal grafico della funzione sopra, è chiaro che i minimi locali si verificano a $ x = \ pm 2 $ e i massimi locali a $ x = 0 $

Step 2- trova i minimi globali della funzione $ f \ left (x \ right) = \ left | 4x ^ 3-3x ^ 2 + 7 \ destra | $

Solution -

Dal grafico della funzione sopra, è chiaro che i minimi globali si verificano a $ x = -1 $.

Sia $ S \ subseteq \ mathbb {R} ^ n $ Un insieme S si dice convesso se il segmento di linea che unisce due punti qualsiasi dell'insieme S appartiene anche a S, cioè se $ x_1, x_2 \ in S $ , quindi $ \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S $ dove $ \ lambda \ in \ left (0,1 \ right) $.

Note -

L'unione di due insiemi convessi può o non può essere convessa.
L'intersezione di due insiemi convessi è sempre convessa.

Proof

Siano $ S_1 $ e $ S_2 $ due insiemi convessi.

Sia $ S_3 = S_1 \ cap S_2 $

Siano $ x_1, x_2 \ in S_3 $

Poiché $ S_3 = S_1 \ cap S_2 $ quindi $ x_1, x_2 \ in S_1 $ e $ x_1, x_2 \ in S_2 $

Poiché $ S_i $ è un insieme convesso, $ \ forall $ $ i \ in 1,2, $

Quindi $ \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S_i $ dove $ \ lambda \ in \ left (0,1 \ right) $

Quindi, $ \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S_1 \ cap S_2 $

$ \ Rightarrow \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S_3 $

Quindi, $ S_3 $ è un insieme convesso.

Media ponderata della forma $ \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_ix_i $, dove $ \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_i = 1 $ e $ \ lambda_i \ geq 0 , \ forall i \ in \ left [1, k \ right] $ è chiamata combinazione conica di $ x_1, x_2, .... x_k. $
Media ponderata della forma $ \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_ix_i $, dove $ \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_i = 1 $ è chiamata combinazione affine di $ x_1 , x_2, .... x_k. $
La media ponderata della forma $ \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_ix_i $ è chiamata combinazione lineare di $ x_1, x_2, .... x_k. $

Esempi

Step 1 - Dimostra che l'insieme $ S = \ left \ {x \ in \ mathbb {R} ^ n: Cx \ leq \ alpha \ right \} $ è un insieme convesso.

Soluzione

Siano $ x_1 $ e $ x_2 \ in S $

$ \ Rightarrow Cx_1 \ leq \ alpha $ e $ \: e \: Cx_2 \ leq \ alpha $

Per mostrare: $ \: \: y = \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ in S \: \ forall \: \ lambda \ in \ left (0,1 \ a destra) $

$ Cy = C \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) = \ lambda Cx_1 + \ sinistra (1- \ lambda \ destra) Cx_2 $

$ \ Rightarrow Cy \ leq \ lambda \ alpha + \ left (1- \ lambda \ right) \ alpha $

$ \ Freccia destra Cy \ leq \ alpha $

$ \ Freccia destra y \ in S $

Pertanto, $ S $ è un insieme convesso.

Step 2 - Dimostra che l'insieme $ S = \ left \ {\ left (x_1, x_2 \ right) \ in \ mathbb {R} ^ 2: x_ {1} ^ {2} \ leq 8x_2 \ right \} $ è convesso impostato.

Soluzione

Siano $ x, y \ in S $

Siano $ x = \ left (x_1, x_2 \ right) $ e $ y = \ left (y_1, y_2 \ right) $

$ \ Rightarrow x_ {1} ^ {2} \ leq 8x_2 $ e $ y_ {1} ^ {2} \ leq 8y_2 $

Per mostrare - $ \ lambda x + \ left (1- \ lambda \ right) y \ in S \ Rightarrow \ lambda \ left (x_1, x_2 \ right) + \ left (1- \ lambda \ right) \ left (y_1, y_2 \ destra) \ in S \ Freccia destra \ sinistra [\ lambda x_1 + \ sinistra (1- \ lambda) y_2] \ in S \ destra) \ destra] $

$ Ora, \ left [\ lambda x_1 + \ left (1- \ lambda \ right) y_1 \ right] ^ {2} = \ lambda ^ 2x_ {1} ^ {2} + \ left (1- \ lambda \ right) ^ 2y_ {1} ^ {2} +2 \ lambda \ left (1- \ lambda \ right) x_1y_1 $

Ma $ 2x_1y_1 \ leq x_ {1} ^ {2} + y_ {1} ^ {2} $

Perciò,

$ \ left [\ lambda x_1 + \ left (1- \ lambda \ right) y_1 \ right] ^ {2} \ leq \ lambda ^ 2x_ {1} ^ {2} + \ left (1- \ lambda \ right) ^ 2y_ {1} ^ {2} +2 \ lambda \ left (1- \ lambda \ right) \ left (x_ {1} ^ {2} + y_ {1} ^ {2} \ right) $

$ \ Rightarrow \ left [\ lambda x_1 + \ left (1- \ lambda \ right) y_1 \ right] ^ {2} \ leq \ lambda x_ {1} ^ {2} + \ left (1- \ lambda \ right) y_ {1} ^ {2} $

$ \ Freccia destra \ sinistra [\ lambda x_1 + \ sinistra (1- \ lambda \ destra) y_1 \ destra] ^ {2} \ leq 8 \ lambda x_2 + 8 \ sinistra (1- \ lambda \ destra) y_2 $

$ \ Rightarrow \ left [\ lambda x_1 + \ left (1- \ lambda \ right) y_1 \ right] ^ {2} \ leq 8 \ left [\ lambda x_2 + \ left (1- \ lambda \ right) y_2 \ right] $

$ \ Rightarrow \ lambda x + \ left (1- \ lambda \ right) y \ in S $

Step 3 - Mostra che un insieme $ S \ in \ mathbb {R} ^ n $ è convesso se e solo se per ogni intero k, ogni combinazione convessa di qualsiasi k punti di $ S $ è in $ S $.

Soluzione

Sia $ S $ un insieme convesso. poi, per mostrare;

$ c_1x_1 + c_2x_2 + ..... + c_kx_k \ in S, \ displaystyle \ sum \ limits_ {1} ^ k c_i = 1, c_i \ geq 0, \ forall i \ in 1,2, ...., k $

Prova per induzione

Per $ k = 1, x_1 \ in S, c_1 = 1 \ Freccia destra c_1x_1 \ in S $

Per $ k = 2, x_1, x_2 \ in S, c_1 + c_2 = 1 $ e Poiché S è un insieme convesso

$ \ Freccia destra c_1x_1 + c_2x_2 \ in S. $

Lascia che la combinazione convessa di m punti di S sia in S cioè,

$ c_1x_1 + c_2x_2 + ... + c_mx_m \ in S, \ displaystyle \ sum \ limits_ {1} ^ m c_i = 1, c_i \ geq 0, \ forall i \ in 1,2, ..., m $

Ora, Sia $ x_1, x_2 ...., x_m, x_ {m + 1} \ in S $

Sia $ x = \ mu_1x_1 + \ mu_2x_2 + ... + \ mu_mx_m + \ mu_ {m + 1} x_ {m + 1} $

Sia $ x = \ left (\ mu_1 + \ mu_2 + ... + \ mu_m \ right) \ frac {\ mu_1x_1 + \ mu_2x_2 + \ mu_mx_m} {\ mu_1 + \ mu_2 + ......... + \ mu_m} + \ mu_ {m + 1} x_ {m + 1} $

Sia $ y = \ frac {\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_mx_m} {\ mu_1 + \ mu_2 + ......... + \ mu_m} $

$ \ Freccia destra x = \ sinistra (\ mu_1 + \ mu_2 + ... + \ mu_m \ destra) y + \ mu_ {m + 1} x_ {m + 1} $

Ora $ y \ in S $ perché la somma dei coefficienti è 1.

$ \ Freccia destra x \ in S $ poiché S è un insieme convesso e $ y, x_ {m + 1} \ in S $

Quindi dimostrato per induzione.

Un insieme $ A $ si dice che sia un insieme affine se per due punti distinti qualsiasi, la linea che passa per questi punti giace nell'insieme $ A $.

Note -

$ S $ è un insieme affine se e solo se contiene ogni combinazione affine dei suoi punti.
Gli insiemi vuoti e singleton sono sia affini che convessi.

Ad esempio, la soluzione di un'equazione lineare è un insieme affine.

Prova

Sia S la soluzione di un'equazione lineare.

Per definizione, $ S = \ left \ {x \ in \ mathbb {R} ^ n: Ax = b \ right \} $

Siano $ x_1, x_2 \ in S \ Rightarrow Ax_1 = b $ e $ Ax_2 = b $

Per provare: $ A \ left [\ theta x_1 + \ left (1- \ theta \ right) x_2 \ right] = b, \ forall \ theta \ in \ left (0,1 \ right) $

$ A \ left [\ theta x_1 + \ left (1- \ theta \ right) x_2 \ right] = \ theta Ax_1 + \ left (1- \ theta \ right) Ax_2 = \ theta b + \ left (1- \ theta \ right ) b = b $

Quindi S è un insieme affine.

Teorema

Se $ C $ è un insieme affine e $ x_0 \ in C $, allora l'insieme $ V = C-x_0 = \ left \ {x-x_0: x \ in C \ right \} $ è un sottospazio di C.

Prova

Siano $ x_1, x_2 \ in V $

Per mostrare: $ \ alpha x_1 + \ beta x_2 \ in V $ per $ \ alpha, \ beta $

Ora, $ x_1 + x_0 \ in C $ e $ x_2 + x_0 \ in C $ per definizione di V

Ora $ \ alpha x_1 + \ beta x_2 + x_0 = \ alpha \ left (x_1 + x_0 \ right) + \ beta \ left (x_2 + x_0 \ right) + \ left (1- \ alpha - \ beta \ right) x_0 $

Ma $ \ alpha \ left (x_1 + x_0 \ right) + \ beta \ left (x_2 + x_0 \ right) + \ left (1- \ alpha - \ beta \ right) x_0 \ in C $ perché C è un insieme affine .

Pertanto, $ \ alpha x_1 + \ beta x_2 \ in V $

Quindi dimostrato.

Lo scafo convesso di un insieme di punti in S è il confine della più piccola regione convessa che contiene tutti i punti di S al suo interno o sul suo confine.

Sia $ S \ subseteq \ mathbb {R} ^ n $ Lo scafo convesso di S, indicato con $ Co \ left (S \ right) $ by è la raccolta di tutte le combinazioni convesse di S, cioè $ x \ in Co \ left (S \ right) $ se e solo se $ x \ in \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_ix_i $, dove $ \ displaystyle \ sum \ limits_ {1} ^ n \ lambda_i = 1 $ e $ \ lambda_i \ geq 0 \ forall x_i \ in S $

Remark - Conve lo scafo di un insieme di punti in S nel piano definisce un poligono convesso e i punti di S sul confine del poligono definisce i vertici del poligono.

Theorem $ Co \ left (S \ right) = \ left \ {x: x = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_ix_i, x_i \ in S, \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_i = 1, \ lambda_i \ geq 0 \ right \} $ Mostra che uno scafo convesso è un insieme convesso.

Prova

Siano $ x_1, x_2 \ in Co \ left (S \ right) $, quindi $ x_1 = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_ix_i $ e $ x_2 = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_ \ gamma x_i $ dove $ \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_i = 1, \ lambda_i \ geq 0 $ e $ \ displaystyle \ sum \ limits_ {i = 1} ^ n \ gamma_i = 1, \ gamma_i \ geq0 $

Per $ \ theta \ in \ left (0,1 \ right), \ theta x_1 + \ left (1- \ theta \ right) x_2 = \ theta \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_ix_i + \ left (1- \ theta \ right) \ displaystyle \ sum \ limits_ {i = 1} ^ n \ gamma_ix_i $

$ \ theta x_1 + \ left (1- \ theta \ right) x_2 = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ lambda_i \ theta x_i + \ displaystyle \ sum \ limits_ {i = 1} ^ n \ gamma_i \ sinistra (1- \ theta \ destra) x_i $

$ \ theta x_1 + \ left (1- \ theta \ right) x_2 = \ displaystyle \ sum \ limits_ {i = 1} ^ n \ left [\ lambda_i \ theta + \ gamma_i \ left (1- \ theta \ right) \ destra] x_i $

Considerando i coefficienti,

$ \ displaystyle \ sum \ limits_ {i = 1} ^ n \ left [\ lambda_i \ theta + \ gamma_i \ left (1- \ theta \ right) \ right] = \ theta \ displaystyle \ sum \ limits_ {i = 1 } ^ n \ lambda_i + \ left (1- \ theta \ right) \ displaystyle \ sum \ limits_ {i = 1} ^ n \ gamma_i = \ theta + \ left (1- \ theta \ right) = 1 $

Quindi, $ \ theta x_1 + \ left (1- \ theta \ right) x_2 \ in Co \ left (S \ right) $

Quindi, uno scafo convesso è un insieme convesso.

Sia S un insieme arbitrario in $ \ mathbb {R} ^ n $. Se $ x \ in Co \ left (S \ right) $, allora $ x \ in Co \ left (x_1, x_2, ...., x_n, x_ {n + 1} \ right) $.

Prova

Poiché $ x \ in Co \ left (S \ right) $, allora $ x $ è rappresentato da una combinazione convessa di un numero finito di punti in S, cioè,

$ x = \ displaystyle \ sum \ limits_ {j = 1} ^ k \ lambda_jx_j, \ displaystyle \ sum \ limits_ {j = 1} ^ k \ lambda_j = 1, \ lambda_j \ geq 0 $ e $ x_j \ in S, \ forall j \ in \ left (1, k \ right) $

Se $ k \ leq n + 1 $, il risultato ottenuto è ovviamente vero.

Se $ k \ geq n + 1 $, allora $ \ left (x_2-x_1 \ right) \ left (x_3-x_1 \ right), ....., \ left (x_k-x_1 \ right) $ dipendono linearmente .

$ \ Rightarrow \ esiste \ mu _j \ in \ mathbb {R}, 2 \ leq j \ leq k $ (non tutto zero) in modo tale che $ \ displaystyle \ sum \ limits_ {j = 2} ^ k \ mu _j \ sinistra (x_j-x_1 \ destra) = 0 $

Definisci $ \ mu_1 = - \ displaystyle \ sum \ limits_ {j = 2} ^ k \ mu _j $, quindi $ \ displaystyle \ sum \ limits_ {j = 1} ^ k \ mu_j x_j = 0, \ displaystyle \ sum \ limiti_ {j = 1} ^ k \ mu_j = 0 $

dove non tutti i $ \ mu_j $ sono uguali a zero. Poiché $ \ displaystyle \ sum \ limits_ {j = 1} ^ k \ mu_j = 0 $, almeno uno dei $ \ mu_j> 0,1 \ leq j \ leq k $

Quindi, $ x = \ displaystyle \ sum \ limits_ {1} ^ k \ lambda_j x_j + 0 $

$ x = \ displaystyle \ sum \ limits_ {1} ^ k \ lambda_j x_j- \ alpha \ displaystyle \ sum \ limits_ {1} ^ k \ mu_j x_j $

$ x = \ displaystyle \ sum \ limits_ {1} ^ k \ left (\ lambda_j- \ alpha \ mu_j \ right) x_j $

Scegli $ \ alpha $ in modo tale che $ \ alpha = min \ left \ {\ frac {\ lambda_j} {\ mu_j}, \ mu_j \ geq 0 \ right \} = \ frac {\ lambda_j} {\ mu _j}, $ per qualche $ i = 1,2, ..., k $

Se $ \ mu_j \ leq 0, \ lambda_j- \ alpha \ mu_j \ geq 0 $

Se $ \ mu_j> 0, allora \: \ frac {\ lambda _j} {\ mu_j} \ geq \ frac {\ lambda_i} {\ mu _i} = \ alpha \ Rightarrow \ lambda_j- \ alpha \ mu_j \ geq 0, j = 1,2, ... k $

In particolare, $ \ lambda_i- \ alpha \ mu_i = 0 $, per definizione di $ \ alpha $

$ x = \ displaystyle \ sum \ limits_ {j = 1} ^ k \ left (\ lambda_j- \ alpha \ mu_j \ right) x_j $, dove

$ \ lambda_j- \ alpha \ mu_j \ geq0 $ e $ \ displaystyle \ sum \ limits_ {j = 1} ^ k \ left (\ lambda_j- \ alpha \ mu_j \ right) = 1 $ e $ \ lambda_i- \ alpha \ mu_i = 0 $

Pertanto, x può essere rappresentato come una combinazione convessa di al massimo (k-1) punti.

Questo processo di riduzione può essere ripetuto fino a quando x non viene rappresentato come una combinazione convessa di (n + 1) elementi.

Sia S un insieme non vuoto, chiuso e limitato (chiamato anche insieme compatto) in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ sia una funzione continua su S, quindi problema min $ \ sinistra \ {f \ sinistra (x \ destra): x \ in S \ destra \} $ raggiunge il suo minimo.

Prova

Poiché S è non vuoto e limitato, esiste un limite inferiore.

$ \ alpha = Inf \ sinistra \ {f \ sinistra (x \ destra): x \ in S \ destra \} $

Ora lascia $ S_j = \ left \ {x \ in S: \ alpha \ leq f \ left (x \ right) \ leq \ alpha + \ delta ^ j \ right \} \ forall j = 1,2, ... $ e $ \ delta \ in \ sinistra (0,1 \ destra) $

Secondo la definizione di infimium, $ S_j $ non è vuoto, per ogni $ j $.

Scegli un po 'di $ x_j \ in S_j $ per ottenere una sequenza $ \ left \ {x_j \ right \} $ per $ j = 1,2, ... $

Poiché S è limitato, anche la sequenza è limitata e c'è una sottosequenza convergente $ \ left \ {y_j \ right \} $, che converge a $ \ hat {x} $. Quindi $ \ hat {x} $ è un punto limite e S è chiuso, quindi $ \ hat {x} \ in S $. Poiché f è continuo, $ f \ left (y_i \ right) \ rightarrow f \ left (\ hat {x} \ right) $.

Dato che $ \ alpha \ leq f \ left (y_i \ right) \ leq \ alpha + \ delta ^ k, \ alpha = \ Displaystyle \ lim_ {k \ rightarrow \ infty} f \ left (y_i \ right) = f \ left ( \ hat {x} \ right) $

Quindi, $ \ hat {x} $ è la soluzione minimizzante.

Osservazioni

Ci sono due importanti condizioni necessarie affinché il teorema di Weierstrass valga. Questi sono i seguenti:

Step 1 - L'insieme S dovrebbe essere un insieme limitato.

Considera la funzione f \ left (x \ right) = x $.

È un insieme illimitato e ha un minimo in qualsiasi punto del suo dominio.

Quindi, per ottenere i minimi, S dovrebbe essere limitato.
Step 2 - Il set S deve essere chiuso.

Considera la funzione $ f \ left (x \ right) = \ frac {1} {x} $ nel dominio \ left (0,1 \ right).

Questa funzione non è chiusa nel dominio dato e anche i suoi minimi non esistono.

Quindi, per ottenere i minimi, S dovrebbe essere chiuso.

Sia S un insieme convesso chiuso non vuoto in $ \ mathbb {R} ^ n $ e $ y \ notin S $, allora $ \ esiste $ un punto $ \ bar {x} \ in S $ con una distanza minima da y, cioè $ \ left \ | y- \ bar {x} \ right \ | \ leq \ sinistra \ | yx \ right \ | \ forall x \ in S. $

Inoltre, $ \ bar {x} $ è un punto di minimizzazione se e solo se $ \ left (y- \ hat {x} \ right) ^ {T} \ left (x- \ hat {x} \ right) \ leq 0 $ o $ \ left (y- \ hat {x}, x- \ hat {x} \ right) \ leq 0 $

Prova

Esistenza del punto più vicino

Poiché $ S \ ne \ phi, \ esiste $ un punto $ \ hat {x} \ in S $ tale che la distanza minima di S da y sia minore o uguale a $ \ left \ | y- \ hat {x} \ right \ | $.

Definisci $ \ hat {S} = S \ cap \ left \ {x: \ left \ | yx \ destra \ | \ leq \ sinistra \ | y- \ hat {x} \ right \ | \ right \} $

Poiché $ \ hat {S} $ è chiuso e limitato, e poiché la norma è una funzione continua, per il teorema di Weierstrass esiste un punto minimo $ \ hat {x} \ in S $ tale che $ \ left \ | y- \ hat {x} \ right \ | = Inf \ left \ {\ left \ | yx \ right \ |, x \ in S \ right \} $

Unicità

Supponiamo $ \ bar {x} \ in S $ tale che $ \ left \ | y- \ hat {x} \ right \ | = \ left \ | y- \ hat {x} \ right \ | = \ alpha $

Poiché S è convesso, $ \ frac {\ hat {x} + \ bar {x}} {2} \ in S $

Ma $ \ left \ | y- \ frac {\ hat {x} - \ bar {x}} {2} \ right \ | \ leq \ frac {1} {2} \ left \ | y- \ hat {x} \ right \ | + \ frac {1} {2} \ left \ | y- \ bar {x} \ right \ | = \ alpha $

Non può essere una disuguaglianza rigorosa perché $ \ hat {x} $ è il più vicino a y.

Pertanto, $ \ left \ | y- \ hat {x} \ right \ | = \ mu \ left \ | y- \ hat {x} \ right \ | $, per qualche $ \ mu $

Ora $ \ sinistra \ | \ mu \ right \ | = 1. $ Se $ \ mu = -1 $, allora $ \ left (y- \ hat {x} \ right) = - \ left (y- \ hat {x} \ right) \ Freccia a destra y = \ frac {\ hat {x} + \ bar {x}} {2} \ in S $

Ma $ y \ in S $. Da qui la contraddizione. Quindi $ \ mu = 1 \ Rightarrow \ hat {x} = \ bar {x} $

Pertanto, il punto di minimizzazione è unico.

Per la seconda parte della dimostrazione, supponiamo $ \ left (y- \ hat {x} \ right) ^ {\ tau} \ left (x- \ bar {x} \ right) \ leq 0 $ per tutti $ x \ in S $

Adesso,

$ \ sinistra \ | yx \ destra \ | ^ {2} = \ sinistra \ | y- \ hat {x} + \ hat {x} -x \ right \ | ^ {2} = \ left \ | y- \ hat {x} \ right \ | ^ {2} + \ left \ | \ hat {x} -x \ right \ | ^ {2} +2 \ left (\ hat {x} -x \ right) ^ {\ tau} \ left (y- \ hat {x} \ right) $

$ \ Freccia destra \ sinistra \ | yx \ destra \ | ^ {2} \ geq \ sinistra \ | y- \ hat {x} \ right \ | ^ {2} $ perché $ \ left \ | \ hat {x} -x \ right \ | ^ {2} \ geq 0 $ e $ \ left (\ hat {x} - x \ right) ^ {T} \ left (y- \ hat {x} \ right ) \ geq 0 $

Quindi, $ \ hat {x} $ sta minimizzando il punto.

Al contrario, supponiamo che $ \ hat {x} $ sia il punto minimizimg.

$ \ Freccia destra \ sinistra \ | yx \ destra \ | ^ {2} \ geq \ sinistra \ | y- \ hat {x} \ right \ | ^ 2 \ forall x \ in S $

Poiché S è un insieme convesso.

$ \ Rightarrow \ lambda x + \ left (1- \ lambda \ right) \ hat {x} = \ hat {x} + \ lambda \ left (x- \ hat {x} \ right) \ in S $ per $ x \ in S $ e $ \ lambda \ in \ sinistra (0,1 \ destra) $

Ora $ \ left \ | y- \ hat {x} - \ lambda \ left (x- \ hat {x} \ right) \ right \ | ^ {2} \ geq \ left \ | y- \ hat {x} \ right \ | ^ 2 $

$ \ sinistra \ | y- \ hat {x} - \ lambda \ left (x- \ hat {x} \ right) \ right \ | ^ {2} = \ left \ | y- \ hat {x} \ right \ | ^ {2} + \ lambda ^ 2 \ left \ | x- \ hat {x} \ right \ | ^ {2} -2 \ lambda \ left (y- \ hat {x} \ right) ^ {T} \ left (x- \ hat {x} \ right) $

$ \ Freccia destra \ sinistra \ | y- \ hat {x} \ right \ | ^ {2} + \ lambda ^ {2} \ left \ | x- \ hat {x} \ right \ | -2 \ lambda \ left (y- \ hat {x} \ right) ^ {T} \ left (x- \ hat {x} \ right) \ geq \ left \ | y- \ hat {x} \ right \ | ^ {2} $

$ \ Rightarrow 2 \ lambda \ left (y- \ hat {x} \ right) ^ {T} \ left (x- \ hat {x} \ right) \ leq \ lambda ^ 2 \ left \ | x- \ hat {x} \ right \ | ^ 2 $

$ \ Rightarrow \ left (y- \ hat {x} \ right) ^ {T} \ left (x- \ hat {x} \ right) \ leq 0 $

Quindi dimostrato.

Sia S un insieme convesso chiuso non vuoto in $ \ mathbb {R} ^ n $ e $ y \ notin S $. Quindi, esiste un vettore diverso da zero $ p $ e $ \ beta $ scalare tali che $ p ^ T y> \ beta $ e $ p ^ T x <\ beta $ per ogni $ x \ in S $

Prova

Poiché S è un insieme convesso chiuso non vuoto e $ y \ notin S $ quindi per il teorema del punto più vicino, esiste un unico punto di minimizzazione $ \ hat {x} \ in S $ tale che

$ \ left (x- \ hat {x} \ right) ^ T \ left (y- \ hat {x} \ right) \ leq 0 \ forall x \ in S $

Siano $ p = \ left (y- \ hat {x} \ right) \ neq 0 $ e $ \ beta = \ hat {x} ^ T \ left (y- \ hat {x} \ right) = p ^ T \ hat {x} $.

Quindi $ \ left (x- \ hat {x} \ right) ^ T \ left (y- \ hat {x} \ right) \ leq 0 $

$ \ Freccia destra \ sinistra (y- \ hat {x} \ destra) ^ T \ sinistra (x- \ hat {x} \ destra) \ leq 0 $

$ \ Rightarrow \ left (y- \ hat {x} \ right) ^ Tx \ leq \ left (y- \ hat {x} \ right) ^ T \ hat {x} = \ hat {x} ^ T \ left (y- \ hat {x} \ right) $ i, e., $ p ^ Tx \ leq \ beta $

Inoltre, $ p ^ Ty- \ beta = \ left (y- \ hat {x} \ right) ^ Ty- \ hat {x} ^ T \ left (y- \ hat {x} \ right) $

$ = \ sinistra (y- \ hat {x} \ destra) ^ T \ sinistra (yx \ destra) = \ sinistra \ | y- \ hat {x} \ right \ | ^ {2}> 0 $

$ \ Freccia destra p ^ Ty> \ beta $

Questo teorema si traduce nella separazione degli iperpiani. Gli iperpiani basati sul teorema di cui sopra possono essere definiti come segue:

Siano $ S_1 $ e $ S_2 $ sottoinsiemi non vuoti di $ \ mathbb {R} $ e $ H = \ left \ {X: A ^ TX = b \ right \} $ un iperpiano.

Si dice che l'iperpiano H separa $ S_1 $ e $ S_2 $ se $ A ^ TX \ leq b \ forall X \ in S_1 $ e $ A_TX \ geq b \ forall X \ in S_2 $
Si dice che l'iperpiano H separa rigorosamente $ S_1 $ e $ S_2 $ se $ A ^ TX <b \ forall X \ in S_1 $ e $ A_TX> b \ forall X \ in S_2 $
Si dice che l'iperpiano H separa fortemente $ S_1 $ e $ S_2 $ se $ A ^ TX \ leq b \ forall X \ in S_1 $ e $ A_TX \ geq b + \ varepsilon \ forall X \ in S_2 $, dove $ \ varepsilon $ è uno scalare positivo.

Un insieme non vuoto C in $ \ mathbb {R} ^ n $ si dice cono con vertice 0 se $ x \ in C \ Rightarrow \ lambda x \ in C \ forall \ lambda \ geq 0 $.

Un insieme C è un cono convesso se convesso così come cono.

Ad esempio, $ y = \ left | x \ right | $ non è un cono convesso perché non è convesso.

Ma $ y \ geq \ left | x \ right | $ è un cono convesso perché è convesso oltre che cono.

Note - Un cono C è convesso se e solo se per ogni $ x, y \ in C, x + y \ in C $.

Prova

Poiché C è cono, per $ x, y \ in C \ Rightarrow \ lambda x \ in C $ e $ \ mu y \ in C \: \ forall \: \ lambda, \ mu \ geq 0 $

C è convesso se $ \ lambda x + \ left (1- \ lambda \ right) y \ in C \: \ forall \: \ lambda \ in \ left (0, 1 \ right) $

Poiché C è cono, $ \ lambda x \ in C $ e $ \ left (1- \ lambda \ right) y \ in C \ Leftrightarrow x, y \ in C $

Quindi C è convesso se $ x + y \ in C $

In generale, se $ x_1, x_2 \ in C $, allora, $ \ lambda_1x_1 + \ lambda_2x_2 \ in C, \ forall \ lambda_1, \ lambda_2 \ geq 0 $

Esempi

La combinazione conica di un insieme infinito di vettori in $ \ mathbb {R} ^ n $ è un cono convesso.
Ogni insieme vuoto è un cono convesso.
Qualsiasi funzione lineare è un cono convesso.
Poiché un iperpiano è lineare, è anche un cono convesso.
Anche i semispazi chiusi sono coni convessi.

Note - L'intersezione di due coni convessi è un cono convesso ma la loro unione può o non può essere un cono convesso.

Sia S un insieme non vuoto in $ \ mathbb {R} ^ n $ Allora, il cono polare di S indicato con $ S ^ * $ è dato da $ S ^ * = \ left \ {p \ in \ mathbb {R } ^ n, p ^ Tx \ leq 0 \: \ forall x \ in S \ right \} $.

Nota

Il cono polare è sempre convesso anche se S non è convesso.
Se S è un insieme vuoto, $ S ^ * = \ mathbb {R} ^ n $.
La polarità può essere vista come una generalizzazione dell'ortogonalità.

Sia $ C \ subseteq \ mathbb {R} ^ n $ quindi lo spazio ortogonale di C, indicato con $ C ^ \ perp = \ left \ {y \ in \ mathbb {R} ^ n: \ left \ langle x, y \ right \ rangle = 0 \ forall x \ in C \ right \} $.

Lemma

Siano $ S, S_1 $ e $ S_2 $ insiemi non vuoti in $ \ mathbb {R} ^ n $ allora le seguenti affermazioni sono vere -

$ S ^ * $ è un cono convesso chiuso.
$ S \ subseteq S ^ {**} $ dove $ S ^ {**} $ è un cono polare di $ S ^ * $.
$ S_1 \ subseteq S_2 \ Rightarrow S_ {2} ^ {*} \ subseteq S_ {1} ^ {*} $.

Prova

Step 1 - $ S ^ * = \ sinistra \ {p \ in \ mathbb {R} ^ n, p ^ Tx \ leq 0 \: \ forall \: x \ in S \ right \} $

Siano $ x_1, x_2 \ in S ^ * \ Rightarrow x_ {1} ^ {T} x \ leq 0 $ e $ x_ {2} ^ {T} x \ leq 0, \ forall x \ in S $

Per $ \ lambda \ in \ left (0, 1 \ right), \ left [\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right] ^ Tx = \ left [\ left (\ lambda x_1 \ right ) ^ T + \ sinistra \ {\ sinistra (1- \ lambda \ destra) x_ {2} \ destra \} ^ {T} \ destra] x, \ forall x \ in S $

$ = \ left [\ lambda x_ {1} ^ {T} + \ left (1- \ lambda \ right) x_ {2} ^ {T} \ right] x = \ lambda x_ {1} ^ {T} x + \ left (1- \ lambda \ right) x_ {2} ^ {T} \ leq 0 $

Quindi $ \ lambda x_1 + \ left (1- \ lambda \ right) x_ {2} \ in S ^ * $

Pertanto $ S ^ * $ è un insieme convesso.
Per $ \ lambda \ geq 0, p ^ {T} x \ leq 0, \ forall \: x \ in S $

Pertanto, $ \ lambda p ^ T x \ leq 0, $

$ \ Freccia destra \ sinistra (\ lambda p \ destra) ^ T x \ leq 0 $

$ \ Rightarrow \ lambda p \ in S ^ * $

Quindi, $ S ^ * $ è un cono.
Per mostrare $ S ^ * $ è chiuso, cioè per mostrare se $ p_n \ rightarrow p $ come $ n \ rightarrow \ infty $, quindi $ p \ in S ^ * $

$ \ forall x \ in S, p_ {n} ^ {T} xp ^ T x = \ left (p_n-p \ right) ^ T x $

Come $ p_n \ rightarrow p $ come $ n \ rightarrow \ infty \ Rightarrow \ left (p_n \ rightarrow p \ right) \ rightarrow 0 $

Quindi $ p_ {n} ^ {T} x \ rightarrow p ^ {T} x $. Ma $ p_ {n} ^ {T} x \ leq 0, \: \ forall x \ in S $

Quindi, $ p ^ Tx \ leq 0, \ forall x \ in S $

$ \ Freccia destra p \ in S ^ * $

Quindi, $ S ^ * $ è chiuso.

Step 2 - $ S ^ {**} = \ sinistra \ {q \ in \ mathbb {R} ^ n: q ^ T p \ leq 0, \ forall p \ in S ^ * \ right \} $

Sia $ x \ in S $, poi $ \ forall p \ in S ^ *, p ^ T x \ leq 0 \ Rightarrow x ^ Tp \ leq 0 \ Rightarrow x \ in S ^ {**} $

Quindi, $ S \ subseteq S ^ {**} $

Step 3 - $ S_2 ^ * = \ sinistra \ {p \ in \ mathbb {R} ^ n: p ^ Tx \ leq 0, \ forall x \ in S_2 \ destra \} $

Da $ S_1 \ subseteq S_2 \ Rightarrow \ forall x \ in S_2 \ Rightarrow \ forall x \ in S_1 $

Pertanto, se $ \ hat {p} \ in S_2 ^ *, $ allora $ \ hat {p} ^ Tx \ leq 0, \ forall x \ in S_2 $

$ \ Rightarrow \ hat {p} ^ Tx \ leq 0, \ forall x \ in S_1 $

$ \ Rightarrow \ hat {p} ^ T \ in S_1 ^ * $

$ \ Rightarrow S_2 ^ * \ subseteq S_1 ^ * $

Teorema

Sia C un cono convesso chiuso non vuoto, quindi $ C = C ^ ** $

Prova

$ C = C ^ {**} $ dal lemma precedente.

Per provare: $ x \ in C ^ {**} \ subseteq C $

Sia $ x \ in C ^ {**} $ e $ x \ notin C $

Quindi, per teorema di separazione fondamentale, esiste un vettore $ p \ neq 0 $ e uno scalare $ \ alpha $ tale che $ p ^ Ty \ leq \ alpha, \ forall y \ in C $

Pertanto, $ p ^ Tx> \ alpha $

Ma poiché $ \ left (y = 0 \ right) \ in C $ e $ p ^ Ty \ leq \ alpha, \ forall y \ in C \ Rightarrow \ alpha \ geq 0 $ e $ p ^ Tx> 0 $

Se $ p \ notin C ^ * $, allora esiste un po 'di $ \ bar {y} \ in C $ tale che $ p ^ T \ bar {y}> 0 $ e $ p ^ T \ left (\ lambda \ bar {y} \ right) $ può essere reso arbitrariamente grande prendendo $ \ lambda $ sufficientemente grande.

Questo è in contraddizione con il fatto che $ p ^ Ty \ leq \ alpha, \ forall y \ in C $

Pertanto, $ p \ in C ^ * $

Poiché $ x \ in C ^ * = \ left \ {q: q ^ Tp \ leq 0, \ forall p \ in C ^ * \ right \} $

Pertanto, $ x ^ Tp \ leq 0 \ Rightarrow p ^ Tx \ leq 0 $

Ma $ p ^ Tx> \ alpha $

Così è il contardiction.

Quindi, $ x \ in C $

Quindi $ C = C ^ {**} $.

Un punto della forma $ \ alpha_1x_1 + \ alpha_2x_2 + .... + \ alpha_nx_n $ con $ \ alpha_1, \ alpha_2, ..., \ alpha_n \ geq 0 $ è chiamato combinazione conica di $ x_1, x_2, ..., x_n. $

Se $ x_i $ sono nel cono convesso C, allora ogni combinazione conica di $ x_i $ è anche in C.
Un insieme C è un cono convesso se contiene tutta la combinazione conica dei suoi elementi.

Scafo Conico

Uno scafo conico è definito come un insieme di tutte le combinazioni coniche di un dato insieme S ed è indicato con coni (S).

Quindi, $ coni \ left (S \ right) = \ left \ {\ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_ix_i: x_i \ in S, \ lambda_i \ in \ mathbb {R}, \ lambda_i \ geq 0, i = 1,2, ... \ right \} $

Lo scafo conico è un insieme convesso.
L'origine appartiene sempre allo scafo conico.

Un insieme in $ \ mathbb {R} ^ n $ si dice poliedrico se è l'intersezione di un numero finito di semispazi chiusi, cioè

$ S = \ sinistra \ {x \ in \ mathbb {R} ^ n: p_ {i} ^ {T} x \ leq \ alpha_i, i = 1,2, ...., n \ destra \} $

Per esempio,

$ \ sinistra \ {x \ in \ mathbb {R} ^ n: AX = b \ destra \} $
$ \ sinistra \ {x \ in \ mathbb {R} ^ n: AX \ leq b \ right \} $
$ \ sinistra \ {x \ in \ mathbb {R} ^ n: AX \ geq b \ destra \} $

Cono poliedrico

Un insieme in $ \ mathbb {R} ^ n $ è detto cono poliedrico se è l'intersezione di un numero finito di mezzi spazi che contengono l'origine, cioè $ S = \ left \ {x \ in \ mathbb { R} ^ n: p_ {i} ^ {T} x \ leq 0, i = 1, 2, ... \ right \} $

Polytope

Un politopo è un insieme poliedrico che è limitato.

Osservazioni

Un politopo è uno scafo convesso di un insieme finito di punti.
Un cono poliedrico è generato da un insieme finito di vettori.
Un insieme poliedrico è un insieme chiuso.
Un insieme poliedrico è un insieme convesso.

Sia S un insieme convesso in $ \ mathbb {R} ^ n $. Un vettore $ x \ in S $ si dice che sia un punto estremo di S se $ x = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 $ con $ x_1, x_2 \ in S $ e $ \ lambda \ in \ left (0, 1 \ right) \ Rightarrow x = x_1 = x_2 $.

Esempio

Step 1 - $ S = \ left \ {\ left (x_1, x_2 \ right) \ in \ mathbb {R} ^ 2: x_ {1} ^ {2} + x_ {2} ^ {2} \ leq 1 \ right \ } $

Punto estremo, $ E = \ left \ {\ left (x_1, x_2 \ right) \ in \ mathbb {R} ^ 2: x_ {1} ^ {2} + x_ {2} ^ {2} = 1 \ right \} $

Step 2 - $ S = \ sinistra \ {\ sinistra (x_1, x_2 \ destra) \ in \ mathbb {R} ^ 2: x_1 + x_2 <2, -x_1 + 2x_2 \ leq 2, x_1, x_2 \ geq 0 \ destra \ } $

Punto estremo, $ E = \ left \ {\ left (0, 0 \ right), \ left (2, 0 \ right), \ left (0, 1 \ right), \ left (\ frac {2} {3 }, \ frac {4} {3} \ right) \ right \} $

Step 3 - S è il politopo formato dai punti $ \ left \ {\ left (0,0 \ right), \ left (1,1 \ right), \ left (1,3 \ right), \ left (-2, 4 \ destra), \ sinistra (0,2 \ destra) \ destra \} $

Punto estremo, $ E = \ sinistra \ {\ sinistra (0,0 \ destra), \ sinistra (1,1 \ destra), \ sinistra (1,3 \ destra), \ sinistra (-2,4 \ destra) \ right \} $

Osservazioni

Qualsiasi punto dell'insieme convesso S può essere rappresentato come una combinazione convessa dei suoi punti estremi.
È vero solo per insiemi chiusi e limitati in $ \ mathbb {R} ^ n $.
Potrebbe non essere vero per i set illimitati.

k punti estremi

Un punto in un insieme convesso è chiamato estremo k se e solo se è il punto interno di un insieme convesso k-dimensionale all'interno di S, e non è un punto interno di un insieme convesso (k + 1) - dimensionale all'interno di S. Fondamentalmente, per un insieme convesso S, k punti estremi creano k facce aperte.

Sia S un insieme convesso chiuso in $ \ mathbb {R} ^ n $. Un vettore diverso da zero $ d \ in \ mathbb {R} ^ n $ è chiamato direzione di S se per ogni $ x \ in S, x + \ lambda d \ in S, \ forall \ lambda \ geq 0. $

Due direzioni $ d_1 $ e $ d_2 $ di S sono chiamate distinte se $ d \ neq \ alpha d_2 $ per $ \ alpha> 0 $.
Una direzione $ d $ di $ S $ è detta direzione estrema se non può essere scritta come una combinazione lineare positiva di due direzioni distinte, ovvero, se $ d = \ lambda _1d_1 + \ lambda _2d_2 $ per $ \ lambda _1, \ lambda _2> 0 $, quindi $ d_1 = \ alpha d_2 $ per $ \ alpha $.
Qualsiasi altra direzione può essere espressa come una combinazione positiva di direzioni estreme.
Per un insieme convesso $ S $, viene chiamata la direzione d tale che $ x + \ lambda d \ in S $ per qualche $ x \ in S $ e tutti i $ \ lambda \ geq0 $ recessive per $ S $.
Sia E l'insieme dei punti in cui una certa funzione $ f: S \ rightarrow $ su un insieme convesso non vuoto S in $ \ mathbb {R} ^ n $ raggiunge il suo massimo, quindi $ E $ è chiamata faccia esposta di $ S $. Le direzioni delle facce esposte sono chiamate direzioni esposte.
Un raggio la cui direzione è una direzione estrema è chiamato raggio estremo.

Esempio

Considera la funzione $ f \ left (x \ right) = y = \ left | x \ right | $, dove $ x \ in \ mathbb {R} ^ n $. Sia d il vettore unitario in $ \ mathbb {R} ^ n $

Quindi, d è la direzione per la funzione f perché per ogni $ \ lambda \ geq 0, x + \ lambda d \ in f \ left (x \ right) $.

Sia $ f: S \ rightarrow \ mathbb {R} $, dove S è convesso non vuoto impostato in $ \ mathbb {R} ^ n $, quindi $ f \ left (x \ right) $ si dice convesso su S se $ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra ( x_2 \ right), \ forall \ lambda \ in \ left (0,1 \ right) $.

D'altra parte, Sia $ f: S \ rightarrow \ mathbb {R} $, dove S è convesso non vuoto impostato in $ \ mathbb {R} ^ n $, quindi $ f \ left (x \ right) $ è detto essere concavo su S se $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ geq \ lambda f \ left (x_1 \ right) + \ left (1- \ lambda \ right ) f \ sinistra (x_2 \ destra), \ forall \ lambda \ in \ sinistra (0, 1 \ destra) $.

Sia $ f: S \ rightarrow \ mathbb {R} $ dove S è convesso non vuoto impostato in $ \ mathbb {R} ^ n $, quindi $ f \ left (x \ right) $ è detto strettamente convesso su S se $ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) <\ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ right), \ forall \ lambda \ in \ left (0, 1 \ right) $.

Sia $ f: S \ rightarrow \ mathbb {R} $ dove S è convesso non vuoto impostato in $ \ mathbb {R} ^ n $, quindi $ f \ left (x \ right) $ è detto strettamente concavo su S se $ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra)> \ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ right), \ forall \ lambda \ in \ left (0, 1 \ right) $.

Esempi

Una funzione lineare è sia convessa che concava.
$ f \ sinistra (x \ destra) = \ sinistra | x \ right | $ è una funzione convessa.
$ f \ left (x \ right) = \ frac {1} {x} $ è una funzione convessa.

Teorema

Siano $ f_1, f_2, ..., f_k: \ mathbb {R} ^ n \ rightarrow \ mathbb {R} $ funzioni convesse. Considera la funzione $ f \ left (x \ right) = \ displaystyle \ sum \ limits_ {j = 1} ^ k \ alpha_jf_j \ left (x \ right) $ dove $ \ alpha_j> 0, j = 1, 2,. ..k, $ allora $ f \ left (x \ right) $ è una funzione convessa.

Prova

Poiché $ f_1, f_2, ... f_k $ sono funzioni convesse

Pertanto, $ f_i \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq \ lambda f_i \ left (x_1 \ right) + \ left (1- \ lambda \ right) f_i \ left (x_2 \ right), \ forall \ lambda \ in \ left (0, 1 \ right) $ e $ i = 1, 2, ...., k $

Considera la funzione $ f \ left (x \ right) $.

Perciò,

$ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) $

$ = \ displaystyle \ sum \ limits_ {j = 1} ^ k \ alpha_jf_j \ left (\ lambda x_1 + 1- \ lambda \ right) x_2 \ leq \ displaystyle \ sum \ limits_ {j = 1} ^ k \ alpha_j \ lambda f_j \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f_j \ sinistra (x_2 \ destra) $

$ \ Freccia destra f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ lambda \ sinistra (\ Displaystyle \ sum \ limits_ {j = 1} ^ k \ alpha _jf_j \ sinistra ( x_1 \ right) \ right) + \ left (\ displaystyle \ sum \ limits_ {j = 1} ^ k \ alpha _jf_j \ left (x_2 \ right) \ right) $

$ \ Freccia destra f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ lambda f \ sinistra (x_2 \ destra) \ leq \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ destra) $

Quindi, $ f \ left (x \ right) $ è una funzione convessa.

Teorema

Sia $ f \ left (x \ right) $ una funzione convessa su un insieme convesso $ S \ subset \ mathbb {R} ^ n $ allora un minimo locale di $ f \ left (x \ right) $ su S è un minimi globali.

Prova

Sia $ \ hat {x} $ un minimo locale per $ f \ left (x \ right) $ e $ \ hat {x} $ non è un minimo globale.

quindi, $ \ esiste \ hat {x} \ in S $ tale che $ f \ left (\ bar {x} \ right) <f \ left (\ hat {x} \ right) $

Poiché $ \ hat {x} $ è un minimo locale, esiste un quartiere $ N_ \ varepsilon \ left (\ hat {x} \ right) $ tale che $ f \ left (\ hat {x} \ right) \ leq f \ left (x \ right), \ forall x \ in N_ \ varepsilon \ left (\ hat {x} \ right) \ cap S $

Ma $ f \ left (x \ right) $ è una funzione convessa su S, quindi per $ \ lambda \ in \ left (0, 1 \ right) $

abbiamo $ \ lambda \ hat {x} + \ left (1- \ lambda \ right) \ bar {x} \ leq \ lambda f \ left (\ hat {x} \ right) + \ left (1- \ lambda \ right) f \ left (\ bar {x} \ right) $

$ \ Rightarrow \ lambda \ hat {x} + \ left (1- \ lambda \ right) \ bar {x} <\ lambda f \ left (\ hat {x} \ right) + \ left (1- \ lambda \ destra) f \ sinistra (\ hat {x} \ destra) $

$ \ Rightarrow \ lambda \ hat {x} + \ left (1- \ lambda \ right) \ bar {x} <f \ left (\ hat {x} \ right), \ forall \ lambda \ in \ left (0 , 1 \ destra) $

Ma per qualche $ \ lambda <1 $ ma vicino a 1, abbiamo

$ \ lambda \ hat {x} + \ left (1- \ lambda \ right) \ bar {x} \ in N_ \ varepsilon \ left (\ hat {x} \ right) \ cap S $ e $ f \ left ( \ lambda \ hat {x} + \ left (1- \ lambda \ right) \ bar {x} \ right) <f \ left (\ bar {x} \ right) $

che è una contraddizione.

Quindi, $ \ bar {x} $ è un minimo globale.

Epigrafe

sia S un sottoinsieme non vuoto in $ \ mathbb {R} ^ n $ e sia $ f: S \ rightarrow \ mathbb {R} $ allora l'epigrafe di f denotata da epi (f) o $ E_f $ è un sottoinsieme di $ \ mathbb {R} ^ n + 1 $ definito da $ E_f = \ left \ {\ left (x, \ alpha \ right): x \ in \ mathbb {R} ^ n, \ alpha \ in \ mathbb { R}, f \ sinistra (x \ destra) \ leq \ alpha \ destra \} $

Ipografo

sia S un sottoinsieme non vuoto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $, quindi l'ipografo di f denotato da hyp (f) o $ H_f = \ left \ {\ left (x, \ alpha \ right): x \ in \ mathbb {R} ^ n, \ alpha \ in \ mathbb {R} ^ n, \ alpha \ in \ mathbb {R}, f \ left ( x \ right) \ geq \ alpha \ right \} $

Teorema

Sia S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} ^ n $, allora f è convesso se e solo se la sua epigrafe $ E_f $ è un insieme convesso.

Prova

Sia f una funzione convessa.

Mostrare $ E_f $ è un insieme convesso.

Lascia $ \ left (x_1, \ alpha_1 \ right), \ left (x_2, \ alpha_2 \ right) \ in E_f, \ lambda \ in \ left (0, 1 \ right) $

Per mostrare $ \ lambda \ left (x_1, \ alpha_1 \ right) + \ left (1- \ lambda \ right) \ left (x_2, \ alpha_2 \ right) \ in E_f $

$ \ Rightarrow \ left [\ lambda x_1 + \ left (1- \ lambda \ right) x_2, \ lambda \ alpha_1 + \ left (1- \ lambda \ right) \ alpha_2 \ right] \ in E_f $

$ f \ sinistra (x_1 \ destra) \ leq \ alpha _1, f \ sinistra (x_2 \ destra) \ leq \ alpha _2 $

Pertanto, $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq \ lambda f \ left (x_1 \ right) + \ left (1- \ lambda \ right) f \ left (x_2 \ destra) $

$ \ Freccia destra f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ lambda \ alpha_1 + \ sinistra (1- \ lambda \ destra) \ alpha_2 $

conversare

Sia $ E_f $ un insieme convesso.

Mostrare f è convesso.

cioè, per mostrare se $ x_1, x_2 \ in S, \ lambda \ left (0, 1 \ right) $

$ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ destra) $

Lascia $ x_1, x_2 \ in S, \ lambda \ in \ left (0, 1 \ right), f \ left (x_1 \ right), f \ left (x_2 \ right) \ in \ mathbb {R} $

Poiché $ E_f $ è un insieme convesso, $ \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2, \ lambda f \ left (x_1 \ right) + \ left (1- \ lambda \ right) \ destra) f \ sinistra (x_2 \ destra) \ in E_f $

Pertanto, $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq \ lambda f \ left (x_1 \ right) + \ left (1- \ lambda \ right) f \ left (x_2 \ destra) $

Sia S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} ^ n $. Allora f è convesso se e solo se per ogni intero $ k> 0 $

$ x_1, x_2, ... x_k \ in S, \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_i = 1, \ lambda_i \ geq 0, \ forall i = 1,2, s, k $, abbiamo $ f \ left (\ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda_ix_i \ right) \ leq \ displaystyle \ sum \ limits_ {i = 1} ^ k \ lambda _if \ left (x \ right) $

Prova

Per induzione su k.

$ k = 1: x_1 \ in S $ Quindi $ f \ left (\ lambda_1 x_1 \ right) \ leq \ lambda_i f \ left (x_1 \ right) $ perché $ \ lambda_i = 1 $.

$ k = 2: \ lambda_1 + \ lambda_2 = 1 $ e $ x_1, x_2 \ in S $

Pertanto, $ \ lambda_1x_1 + \ lambda_2x_2 \ in S $

Quindi, per definizione, $ f \ left (\ lambda_1 x_1 + \ lambda_2 x_2 \ right) \ leq \ lambda _1f \ left (x_1 \ right) + \ lambda _2f \ left (x_2 \ right) $

Lascia che l'affermazione sia vera per $ n <k $

Perciò,

$ f \ sinistra (\ lambda_1 x_1 + \ lambda_2 x_2 + .... + \ lambda_k x_k \ destra) \ leq \ lambda_1 f \ sinistra (x_1 \ destra) + \ lambda_2 f \ sinistra (x_2 \ destra) + ... + \ lambda_k f \ left (x_k \ right) $

$ k = n + 1: $ Sia $ x_1, x_2, .... x_n, x_ {n + 1} \ in S $ e $ \ displaystyle \ sum \ limits_ {i = 1} ^ {n + 1} = 1 $

Quindi $ \ mu_1x_1 + \ mu_2x_2 + ....... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ in S $

quindi $ f \ left (\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ right) $

$ = f \ sinistra (\ sinistra (\ mu_1 + \ mu_2 + ... + \ mu_n \ destra) \ frac {\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n} {\ mu_1 + \ mu_2 + \ mu_3} + \ mu_ {n + 1} x_ {n + 1} \ right) $

$ = f \ left (\ mu_y + \ mu_ {n + 1} x_ {n + 1} \ right) $ dove $ \ mu = \ mu_1 + \ mu_2 + ... + \ mu_n $ e

$ y = \ frac {\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n} {\ mu_1 + \ mu_2 + ... + \ mu_n} $ e anche $ \ mu_1 + \ mu_ {n + 1} = 1, y \ in S $

$ \ Freccia destra f \ sinistra (\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ destra) \ leq \ mu f \ sinistra (y \ destra) + \ mu_ {n +1} f \ sinistra (x_ {n + 1} \ destra) $

$ \ Freccia destra f \ sinistra (\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ destra) \ leq $

$ \ sinistra (\ mu_1 + \ mu_2 + ... + \ mu_n \ destra) f \ sinistra (\ frac {\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n} {\ mu_1 + \ mu_2 + ... + \ mu_n} \ destra) + \ mu_ {n + 1} f \ sinistra (x_ {n + 1} \ destra) $

$ \ Freccia destra f \ sinistra (\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ destra) \ leq \ sinistra (\ mu_1 + \ mu_2 + ... + \ mu_n \ a destra) $

$ \ left [\ frac {\ mu_1} {\ mu_1 + \ mu_2 + ... + \ mu_n} f \ left (x_1 \ right) + ... + \ frac {\ mu_n} {\ mu_1 + \ mu_2 + ... + \ mu_n} f \ sinistra (x_n \ destra) \ destra] + \ mu_ {n + 1} f \ sinistra (x_ {n + 1} \ destra) $

$ \ Freccia destra f \ sinistra (\ mu_1x_1 + \ mu_2x_2 + ... + \ mu_nx_n + \ mu_ {n + 1} x_ {n + 1} \ destra) \ leq \ mu_1f \ sinistra (x_1 \ destra) + \ mu_2f \ sinistra ( x_2 \ destra) + .... $

Quindi dimostrato.

Sia S un insieme aperto non vuoto in $ \ mathbb {R} ^ n $, allora $ f: S \ rightarrow \ mathbb {R} $ si dice differenziabile in $ \ hat {x} \ in S $ se esiste un vettore $ \ bigtriangledown f \ left (\ hat {x} \ right) $ chiamato vettore gradiente e una funzione $ \ alpha: \ mathbb {R} ^ n \ rightarrow \ mathbb {R} $ tale che

$ f \ sinistra (x \ destra) = f \ sinistra (\ hat {x} \ destra) + \ bigtriangledown f \ sinistra (\ hat {x} \ destra) ^ T \ sinistra (x- \ hat {x} \ destra) + \ sinistra \ | x = \ hat {x} \ right \ | \ alpha \ left (\ hat {x}, x- \ hat {x} \ right), \ forall x \ in S $ dove

$ \ alpha \ left (\ hat {x}, x- \ hat {x} \ right) \ rightarrow 0 \ bigtriangledown f \ left (\ hat {x} \ right) = \ left [\ frac {\ partial f} {\ partial x_1} \ frac {\ partial f} {\ partial x_2} ... \ frac {\ partial f} {\ partial x_n} \ right] _ {x = \ hat {x}} ^ {T} $

Teorema

sia S un convesso non vuoto, aperto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ sia differenziabile su S. Allora, f è convesso se e solo se per $ x_1, x_2 \ in S, \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) \ leq f \ sinistra (x_1 \ destra) -f \ sinistra (x_2 \ destra) $

Prova

Sia f una funzione convessa. cioè, per $ x_1, x_2 \ in S, \ lambda \ in \ left (0, 1 \ right) $

$ f \ sinistra [\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra] \ leq \ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ destra) $

$ \ Freccia destra f \ sinistra [\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra] \ leq \ lambda \ sinistra (f \ sinistra (x_1 \ destra) -f \ sinistra (x_2 \ destra) \ destra ) + f \ sinistra (x_2 \ destra) $

$ \ Freccia destra \ lambda \ sinistra (f \ sinistra (x_1 \ destra) -f \ sinistra (x_2 \ destra) \ destra) \ geq f \ sinistra (x_2 + \ lambda \ sinistra (x_1-x_2 \ destra) \ destra) - f \ sinistra (x_2 \ destra) $

$ \ Rightarrow \ lambda \ left (f \ left (x_1 \ right) -f \ left (x_2 \ right) \ right) \ geq f \ left (x_2 \ right) + \ bigtriangledown f \ left (x_2 \ right) ^ T \ sinistra (x_1-x_2 \ destra) \ lambda + $

$ \ sinistra \ | \ lambda \ sinistra (x_1-x_2 \ destra) \ destra \ | \ alpha \ sinistra (x_2, \ lambda \ sinistra (x_1 - x_2 \ destra) -f \ sinistra (x_2 \ destra) \ destra) $

dove $ \ alpha \ left (x_2, \ lambda \ left (x_1 - x_2 \ right) \ right) \ rightarrow 0 $ come $ \ lambda \ rightarrow 0 $

Dividendo per $ \ lambda $ su entrambi i lati, otteniamo -

$ f \ sinistra (x_1 \ destra) -f \ sinistra (x_2 \ destra) \ geq \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) $

conversare

Lascia per $ x_1, x_2 \ in S, \ bigtriangledown f \ left (x_2 \ right) ^ T \ left (x_1-x_2 \ right) \ leq f \ left (x_1 \ right) -f \ left (x_2 \ right) $

Per mostrare che f è convesso.

Poiché S è convesso, $ x_3 = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S, \ lambda \ in \ left (0, 1 \ right) $

Poiché $ x_1, x_3 \ in S $, quindi

$ f \ sinistra (x_1 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_1 -x_3 \ destra) $

$ \ Freccia destra f \ sinistra (x_1 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_1 - \ lambda x_1- \ sinistra (1- \ lambda \ destra) x_2 \ destra) $

$ \ Freccia destra f \ sinistra (x_1 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ sinistra (1- \ lambda \ destra) \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_1 - x_2 \ destra) $

Poiché, quindi, $ x_2, x_3 \ in S $

$ f \ sinistra (x_2 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_2-x_3 \ destra) $

$ \ Freccia destra f \ sinistra (x_2 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_2- \ lambda x_1- \ sinistra (1- \ lambda \ destra) x_2 \ destra) $

$ \ Freccia destra f \ sinistra (x_2 \ destra) -f \ sinistra (x_3 \ destra) \ geq \ sinistra (- \ lambda \ destra) \ bigtriangledown f \ sinistra (x_3 \ destra) ^ T \ sinistra (x_1-x_2 \ a destra) $

Quindi, combinando le equazioni di cui sopra, otteniamo:

$ \ lambda \ sinistra (f \ sinistra (x_1 \ destra) -f \ sinistra (x_3 \ destra) \ destra) + \ sinistra (1- \ lambda \ destra) \ sinistra (f \ sinistra (x_2 \ destra) -f \ sinistra (x_3 \ destra) \ destra) \ geq 0 $

$ \ Freccia destra f \ sinistra (x_3 \ destra) \ leq \ lambda f \ sinistra (x_1 \ destra) + \ sinistra (1- \ lambda \ destra) f \ sinistra (x_2 \ destra) $

Teorema

sia S un insieme convesso aperto non vuoto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ sia differenziabile su S, allora f è convesso su S se e solo se per qualsiasi $ x_1, x_2 \ in S, \ left (\ bigtriangledown f \ left (x_2 \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $

Prova

sia f una funzione convessa, quindi usando il teorema precedente -

$ \ bigtriangledown f \ left (x_2 \ right) ^ T \ left (x_1-x_2 \ right) \ leq f \ left (x_1 \ right) -f \ left (x_2 \ right) $ e

$ \ bigtriangledown f \ sinistra (x_1 \ destra) ^ T \ sinistra (x_2-x_1 \ destra) \ leq f \ sinistra (x_2 \ destra) -f \ sinistra (x_1 \ destra) $

Aggiungendo le due equazioni precedenti, otteniamo:

$ \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) + \ bigtriangledown f \ sinistra (x_1 \ destra) ^ T \ sinistra (x_2-x_1 \ destra) \ leq 0 $

$ \ Rightarrow \ left (\ bigtriangledown f \ left (x_2 \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (x_1-x_2 \ right) \ leq 0 $

$ \ Rightarrow \ left (\ bigtriangledown f \ left (x_2 \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $

conversare

Lascia per qualsiasi $ x_1, x_2 \ in S, \ left (\ bigtriangledown f \ left (x_2 \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $

Per mostrare che f è convesso.

Siano $ x_1, x_2 \ in S $, quindi per teorema del valore medio, $ \ frac {f \ left (x_1 \ right) -f \ left (x_2 \ right)} {x_1-x_2} = \ bigtriangledown f \ left ( x \ right), x \ in \ left (x_1-x_2 \ right) \ Rightarrow x = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 $ perché S è un insieme convesso.

$ \ Freccia destra f \ sinistra (x_1 \ destra) - f \ sinistra (x_2 \ destra) = \ sinistra (\ bigtriangledown f \ sinistra (x \ destra) ^ T \ destra) \ sinistra (x_1-x_2 \ destra) $

per $ x, x_1 $, sappiamo -

$ \ left (\ bigtriangledown f \ left (x \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (x-x_1 \ right) \ geq 0 $

$ \ Rightarrow \ left (\ bigtriangledown f \ left (x \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2- x_1 \ destra) \ geq 0 $

$ \ Rightarrow \ left (\ bigtriangledown f \ left (x \ right) - \ bigtriangledown f \ left (x_1 \ right) \ right) ^ T \ left (1- \ lambda \ right) \ left (x_2-x_1 \ right ) \ geq 0 $

$ \ Rightarrow \ bigtriangledown f \ left (x \ right) ^ T \ left (x_2-x_1 \ right) \ geq \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) $

Combinando le equazioni precedenti, otteniamo -

$ \ Rightarrow \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) \ leq f \ left (x_2 \ right) -f \ left (x_1 \ right) $

Quindi usando l'ultimo teorema, f è una funzione convessa.

Funzione due volte differenziabili

Sia S un sottoinsieme non vuoto di $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ allora f si dice differenziabile due volte in $ \ bar {x} \ in S $ se esiste un vettore $ \ bigtriangledown f \ left (\ bar {x} \ right), una \: nXn $ matrice $ H \ left (x \ right) $ (chiamata matrice hessiana) e una funzione $ \ alpha: \ mathbb {R} ^ n \ rightarrow \ mathbb {R} $ tale che $ f \ left (x \ right) = f \ left (\ bar {x} + x- \ bar {x} \ right) = f \ sinistra (\ bar {x} \ destra) + \ bigtriangledown f \ sinistra (\ bar {x} \ destra) ^ T \ sinistra (x- \ bar {x} \ destra) + \ frac {1} {2} \ sinistra (x- \ bar {x} \ destra) H \ sinistra (\ bar {x} \ destra) \ sinistra (x- \ bar {x} \ destra) $

dove $ \ alpha \ left (\ bar {x}, x- \ bar {x} \ right) \ rightarrow Oasx \ rightarrow \ bar {x} $

Teorema

Sia f una funzione differenziabile due volte. Se $ \ bar {x} $ è un minimo locale, allora $ \ bigtriangledown f \ left (\ bar {x} \ right) = 0 $ e la matrice Hessiana $ H \ left (\ bar {x} \ right) $ è un semidefinito positivo.

Prova

Sia $ d \ in \ mathbb {R} ^ n $. Poiché f è differenziabile due volte in $ \ bar {x} $.

Perciò,

$ f \ sinistra (\ bar {x} + \ lambda d \ destra) = f \ sinistra (\ bar {x} \ destra) + \ lambda \ bigtriangledown f \ sinistra (\ bar {x} \ destra) ^ T d + \ lambda ^ 2d ^ TH \ sinistra (\ bar {x} \ destra) d + \ lambda ^ 2d ^ TH \ sinistra (\ bar {x} \ destra) d + $

$ \ lambda ^ 2 \ sinistra \ | d \ right \ | ^ 2 \ beta \ left (\ bar {x}, \ lambda d \ right) $

Ma $ \ bigtriangledown f \ left (\ bar {x} \ right) = 0 $ e $ \ beta \ left (\ bar {x}, \ lambda d \ right) \ rightarrow 0 $ come $ \ lambda \ rightarrow 0 $

$ \ Freccia destra f \ sinistra (\ bar {x} + \ lambda d \ destra) -f \ sinistra (\ bar {x} \ destra) = \ lambda ^ 2d ^ TH \ sinistra (\ bar {x} \ destra) d $

Poiché $ \ bar {x} $ è un minimo locale, esiste un $ \ delta> 0 $ tale che $ f \ left (x \ right) \ leq f \ left (\ bar {x} + \ lambda d \ right ), \ forall \ lambda \ in \ left (0, \ delta \ right) $

Teorema

Sia $ f: S \ rightarrow \ mathbb {R} ^ n $ dove $ S \ subset \ mathbb {R} ^ n $ differenziabile due volte su S. Se $ \ bigtriangledown f \ left (x \ right) = 0 $ e $ H \ left (\ bar {x} \ right) $ è semi-definito positivo, per tutti i $ x \ in S $, allora $ \ bar {x} $ è una soluzione ottimale globale.

Prova

Poiché $ H \ left (\ bar {x} \ right) $ è semidefinita positiva, f è una funzione convessa su S. Poiché f è differenziabile e convessa in $ \ bar {x} $

$ \ bigtriangledown f \ left (\ bar {x} \ right) ^ T \ left (x- \ bar {x} \ right) \ leq f \ left (x \ right) -f \ left (\ bar {x} \ right), \ forall x \ in S $

Dato che $ \ bigtriangledown f \ left (\ bar {x} \ right) = 0, f \ left (x \ right) \ geq f \ left (\ bar {x} \ right) $

Quindi, $ \ bar {x} $ è un optima globale.

Teorema

Supponiamo che $ \ bar {x} \ in S $ sia una soluzione ottimale locale al problema $ f: S \ rightarrow \ mathbb {R} $ dove S è un sottoinsieme non vuoto di $ \ mathbb {R} ^ n $ e S è convesso. $ min \: f \ sinistra (x \ destra) $ dove $ x \ in S $.

Poi:

$ \ bar {x} $ è una soluzione ottimale globale.
Se $ \ bar {x} $ è un minimo strettamente locale oppure f è una funzione strettamente convessa, allora $ \ bar {x} $ è l'unica soluzione ottimale globale ed è anche un minimo locale forte.

Prova

Sia $ \ bar {x} $ un'altra soluzione ottimale globale al problema tale che $ x \ neq \ bar {x} $ e $ f \ left (\ bar {x} \ right) = f \ left (\ hat { x} \ right) $

Poiché $ \ hat {x}, \ bar {x} \ in S $ e S è convesso, allora $ \ frac {\ hat {x} + \ bar {x}} {2} \ in S $ ed f è strettamente convesso.

$ \ Rightarrow f \ left (\ frac {\ hat {x} + \ bar {x}} {2} \ right) <\ frac {1} {2} f \ left (\ bar {x} \ right) + \ frac {1} {2} f \ left (\ hat {x} \ right) = f \ left (\ hat {x} \ right) $

Questa è una contraddizione.

Quindi, $ \ hat {x} $ è una soluzione ottimale globale unica.

Corollario

Sia $ f: S \ subset \ mathbb {R} ^ n \ rightarrow \ mathbb {R} $ una funzione convessa differenziabili dove $ \ phi \ neq S \ subset \ mathbb {R} ^ n $ è un insieme convesso. Considera il problema $ min f \ left (x \ right), x \ in S $, quindi $ \ bar {x} $ è una soluzione ottimale se $ \ bigtriangledown f \ left (\ bar {x} \ right) ^ T \ left (x- \ bar {x} \ right) \ geq 0, \ forall x \ in S. $

Prova

Sia $ \ bar {x} $ una soluzione ottimale, cioè $ f \ left (\ bar {x} \ right) \ leq f \ left (x \ right), \ forall x \ in S $

$ \ Freccia destra f \ sinistra (x \ destra) = f \ sinistra (\ bar {x} \ destra) \ geq 0 $

$ f \ sinistra (x \ destra) = f \ sinistra (\ bar {x} \ destra) + \ bigtriangledown f \ sinistra (\ bar {x} \ destra) ^ T \ sinistra (x- \ bar {x} \ destra) + \ sinistra \ | x- \ bar {x} \ right \ | \ alpha \ left (\ bar {x}, x- \ bar {x} \ right) $

dove $ \ alpha \ left (\ bar {x}, x- \ bar {x} \ right) \ rightarrow 0 $ come $ x \ rightarrow \ bar {x} $

$ \ Freccia destra f \ sinistra (x \ destra) -f \ sinistra (\ bar {x} \ destra) = \ bigtriangledown f \ sinistra (\ bar {x} \ destra) ^ T \ sinistra (x- \ bar {x } \ right) \ geq 0 $

Corollario

Sia f una funzione convessa differenziabile in $ \ bar {x} $, allora $ \ bar {x} $ è il minimo globale se e solo se $ \ bigtriangledown f \ left (\ bar {x} \ right) = 0 $

Esempi

$ f \ sinistra (x \ destra) = \ sinistra (x ^ 2-1 \ destra) ^ {3}, x \ in \ mathbb {R} $.

$ \ bigtriangledown f \ left (x \ right) = 0 \ Rightarrow x = -1,0,1 $.

$ \ bigtriangledown ^ 2f \ sinistra (\ pm 1 \ destra) = 0, \ bigtriangledown ^ 2 f \ sinistra (0 \ destra) = 6> 0 $.

$ f \ sinistra (\ pm 1 \ destra) = 0, f \ sinistra (0 \ destra) = - 1 $

Quindi, $ f \ sinistra (x \ destra) \ geq -1 = f \ sinistra (0 \ destra) \ Freccia destra f \ sinistra (0 \ destra) \ leq f \ sinistra (x \ destra) \ forall x \ in \ mathbb {R} $
$ f \ left (x \ right) = x \ log x $ definito su $ S = \ left \ {x \ in \ mathbb {R}, x> 0 \ right \} $.

$ {f} 'x = 1 + \ log x $

$ {f} '' x = \ frac {1} {x}> 0 $

Pertanto, questa funzione è strettamente convessa.
$ f \ left (x \ right) = e ^ {x}, x \ in \ mathbb {R} $ è strettamente convesso.

Sia $ f: S \ rightarrow \ mathbb {R} $ dove $ S \ subset \ mathbb {R} ^ n $ è un insieme convesso non vuoto. La funzione f si dice quasiconvex se per ogni $ x_1, x_2 \ in S $, abbiamo $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq max \ left \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \}, \ lambda \ in \ sinistra (0, 1 \ destra) $

Ad esempio, $ f \ left (x \ right) = x ^ {3} $

Sia $ f: S \ rightarrow R $ dove $ S \ subset \ mathbb {R} ^ n $ è un insieme convesso non vuoto. La funzione f si dice quasiconvex se per ogni $ x_1, x_2 \ in S $, abbiamo $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ geq min \ left \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \}, \ lambda \ in \ sinistra (0, 1 \ destra) $

Osservazioni

Ogni funzione convessa è quasiconvessa ma non è vero il contrario.
Una funzione che è sia quasiconvessa che quasiconcava è chiamata quasimonotone.

Teorema

Sia $ f: S \ rightarrow \ mathbb {R} $ e S sia un insieme convesso non vuoto in $ \ mathbb {R} ^ n $. La funzione f è quasiconvessa se e solo se $ S _ {\ alpha} = \ left (x \ in S: f \ left (x \ right) \ leq \ alpha \ right \} $ è convessa per ogni numero reale \ alpha $

Prova

Sia f quasiconvex su S.

Siano $ x_1, x_2 \ in S _ {\ alpha} $ quindi $ x_1, x_2 \ in S $ e $ max \ left \ {f \ left (x_1 \ right), f \ left (x_2 \ right) \ right \} \ leq \ alpha $

Lascia $ \ lambda \ in \ left (0, 1 \ right) $ e $ x = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ leq max \ left \ {f \ left (x_1 \ right) , f \ sinistra (x_2 \ destra) \ destra \} \ Freccia destra x \ in S $

Quindi, $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq max \ left \ {f \ left (x_1 \ right), f \ left (x_2 \ right) \ right \} \ leq \ alpha $

Pertanto, $ S _ {\ alpha} $ è convesso.

conversare

Sia $ S _ {\ alpha} $ convesso per ogni $ \ alpha $

$ x_1, x_2 \ in S, \ lambda \ in \ sinistra (0,1 \ destra) $

$ x = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 $

Sia $ x = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 $

Per $ x_1, x_2 \ in S _ {\ alpha}, \ alpha = max \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \} $

$ \ Rightarrow \ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ in S _ {\ alpha} $

$ \ Freccia destra f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) \ leq \ alpha $

Quindi dimostrato.

Teorema

Sia $ f: S \ rightarrow \ mathbb {R} $ e S sia un insieme convesso non vuoto in $ \ mathbb {R} ^ n $. La funzione f è quasiconcava se e solo se $ S _ {\ alpha} = \ left \ {x \ in S: f \ left (x \ right) \ geq \ alpha \ right \} $ è convesso per ogni numero reale $ \ alpha $.

Teorema

Sia $ f: S \ rightarrow \ mathbb {R} $ e S sia un insieme convesso non vuoto in $ \ mathbb {R} ^ n $. La funzione f è quasimonotone se e solo se $ S _ {\ alpha} = \ left \ {x \ in S: f \ left (x \ right) = \ alpha \ right \} $ è convesso per ogni numero reale $ \ alpha $.

Teorema

Sia S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ sia differenziabile su S, allora f è quasiconvex se e solo se per ogni $ x_1, x_2 \ in S $ e $ f \ sinistra (x_1 \ destra) \ leq f \ sinistra (x_2 \ destra) $, abbiamo $ \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_2-x_1 \ destra) \ leq 0 $

Prova

Sia f una funzione quasiconvessa.

Siano $ x_1, x_2 \ in S $ tali che $ f \ left (x_1 \ right) \ leq f \ left (x_2 \ right) $

Per differenziabilità di f a $ x_2, \ lambda \ in \ left (0, 1 \ right) $

$ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) = f \ sinistra (x_2 + \ lambda \ sinistra (x_1-x_2 \ destra) \ destra) = f \ sinistra (x_2 \ destra ) + \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) $

$ + \ lambda \ sinistra \ | x_1-x_2 \ destra \ | \ alpha \ sinistra (x_2, \ lambda \ sinistra (x_1-x_2 \ destra) \ destra) $

$ \ Freccia destra f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra) -f \ sinistra (x_2 \ destra) -f \ sinistra (x_2 \ destra) = \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) $

$ + \ lambda \ sinistra \ | x_1-x_2 \ destra \ | \ alpha \ sinistra (x2, \ lambda \ sinistra (x_1-x_2 \ destra) \ destra) $

Ma poiché f è quasiconvex, $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq f \ left (x_2 \ right) $

$ \ bigtriangledown f \ sinistra (x_2 \ destra) ^ T \ sinistra (x_1-x_2 \ destra) + \ lambda \ sinistra \ | x_1-x_2 \ destra \ | \ alpha \ sinistra (x_2, \ lambda \ sinistra (x_1, x_2 \ destra) \ destra) \ leq 0 $

Ma $ \ alpha \ left (x_2, \ lambda \ left (x_1, x_2 \ right) \ right) \ rightarrow 0 $ come $ \ lambda \ rightarrow 0 $

Pertanto, $ \ bigtriangledown f \ left (x_2 \ right) ^ T \ left (x_1-x_2 \ right) \ leq 0 $

conversare

lascia per $ x_1, x_2 \ in S $ e $ f \ left (x_1 \ right) \ leq f \ left (x_2 \ right) $, $ \ bigtriangledown f \ left (x_2 \ right) ^ T \ left (x_1, x_2 \ destra) \ leq 0 $

Per mostrare che f è quasiconvex, cioè $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) \ leq f \ left (x_2 \ right) $

Proof by contradiction

Supponiamo che esista un $ x_3 = \ lambda x_1 + \ left (1- \ lambda \ right) x_2 $ tale che $ f \ left (x_2 \ right) <f \ left (x_3 \ right) $ per qualche $ \ lambda \ in \ sinistra (0, 1 \ destra) $

Per $ x_2 $ e $ x_3, \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_2-x_3 \ right) \ leq 0 $

$ \ Rightarrow - \ lambda \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_2-x_3 \ right) \ leq 0 $

$ \ Rightarrow \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_1-x_2 \ right) \ geq 0 $

Per $ x_1 $ e $ x_3, \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_1-x_3 \ right) \ leq 0 $

$ \ Rightarrow \ left (1- \ lambda \ right) \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_1-x_2 \ right) \ leq 0 $

$ \ Rightarrow \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_1-x_2 \ right) \ leq 0 $

quindi, dalle equazioni precedenti, $ \ bigtriangledown f \ left (x_3 \ right) ^ T \ left (x_1-x_2 \ right) = 0 $

Definisci $ U = \ left \ {x: f \ left (x \ right) \ leq f \ left (x_2 \ right), x = \ mu x_2 + \ left (1- \ mu \ right) x_3, \ mu \ in \ sinistra (0,1 \ destra) \ destra \} $

Quindi possiamo trovare $ x_0 \ in U $ tale che $ x_0 = \ mu_0 x_2 = \ mu x_2 + \ left (1- \ mu \ right) x_3 $ per qualche $ \ mu _0 \ in \ left (0,1 \ right ) $ che è il più vicino a $ x_3 $ e $ \ hat {x} \ in \ left (x_0, x_1 \ right) $ in modo tale che per teorema del valore medio,

$$ \ frac {f \ left (x_3 \ right) -f \ left (x_0 \ right)} {x_3-x_0} = \ bigtriangledown f \ left (\ hat {x} \ right) $$

$$ \ Rightarrow f \ left (x_3 \ right) = f \ sinistra (x_0 \ right) + \ bigtriangledown f \ left (\ hat {x} \ right) ^ T \ left (x_3-x_0 \ right) $$

$$ \ Freccia destra f \ sinistra (x_3 \ destra) = f \ sinistra (x_0 \ destra) + \ mu_0 \ lambda f \ sinistra (\ hat {x} \ destra) ^ T \ sinistra (x_1-x_2 \ destra) $ $

Poiché $ x_0 $ è una combinazione di $ x_1 $ e $ x_2 $ e $ f \ left (x_2 \ right) <f \ left (\ hat {x} \ right) $

Ripetendo la procedura di avvio, $ \ bigtriangledown f \ left (\ hat {x} \ right) ^ T \ left (x_1-x_2 \ right) = 0 $

Quindi, combinando le equazioni precedenti, otteniamo:

$$ f \ sinistra (x_3 \ destra) = f \ sinistra (x_0 \ destra) \ leq f \ sinistra (x_2 \ destra) $$

$$ \ Freccia destra f \ sinistra (x_3 \ destra) \ leq f \ sinistra (x_2 \ destra) $$

Quindi, è una contraddizione.

Esempi

Step 1 - $ f \ sinistra (x \ destra) = X ^ 3 $

$ Sia f \ left (x_1 \ right) \ leq f \ left (x_2 \ right) $

$ \ Rightarrow x_ {1} ^ {3} \ leq x_ {2} ^ {3} \ Rightarrow x_1 \ leq x_2 $

$ \ bigtriangledown f \ left (x_2 \ right) \ left (x_1-x_2 \ right) = 3x_ {2} ^ {2} \ left (x_1-x_2 \ right) \ leq 0 $

Quindi, $ f \ left (x \ right) $ è quasiconvex.

Step 2 - $ f \ sinistra (x \ destra) = x_ {1} ^ {3} + x_ {2} ^ {3} $

Siano $ \ hat {x_1} = \ left (2, -2 \ right) $ e $ \ hat {x_2} = \ left (1, 0 \ right) $

quindi $ f \ left (\ hat {x_1} \ right) = 0, f \ left (\ hat {x_2} \ right) = 1 \ Rightarrow f \ left (\ hat {x_1} \ right) \ setminus <f \ left (\ hat {x_2} \ right) $

Quindi, $ \ bigtriangledown f \ left (\ hat {x_2} \ right) ^ T \ left (\ hat {x_1} - \ hat {x_2} \ right) = \ left (3, 0 \ right) ^ T \ left (1, -2 \ destra) = 3> 0 $

Quindi $ f \ left (x \ right) $ non è quasiconvex.

Siano $ f: S \ rightarrow \ mathbb {R} ^ n $ e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ allora f si dice che sia strettamente una funzione quasicovessa se per ogni $ x_1, x_2 \ in S $ con $ f \ left (x_1 \ right) \ neq f \ left (x_2 \ right) $, abbiamo $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) <max \: \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \} $

Osservazioni

Ogni funzione strettamente quasiconvessa è strettamente convessa.
La funzione rigorosamente quasiconvessa non implica la quasiconvessità.
La funzione strettamente quasiconvessa potrebbe non essere fortemente quasiconvessa.
La funzione pseudoconvessa è una funzione strettamente quasiconvessa.

Teorema

Sia $ f: S \ rightarrow \ mathbb {R} ^ n $ una funzione strettamente quasiconvessa e S sia un insieme convesso non vuoto in $ \ mathbb {R} ^ n $. Considera il problema: $ min \: f \ left (x \ destra), x \ in S $. Se $ \ hat {x} $ è la soluzione ottima locale, allora $ \ bar {x} $ è la soluzione ottimale globale.

Prova

Lascia che esista $ \ bar {x} \ in S $ tale che $ f \ left (\ bar {x} \ right) \ leq f \ left (\ hat {x} \ right) $

Poiché $ \ bar {x}, \ hat {x} \ in S $ e S è un insieme convesso, quindi,

$$ \ lambda \ bar {x} + \ left (1- \ lambda \ right) \ hat {x} \ in S, \ forall \ lambda \ in \ left (0,1 \ right) $$

Poiché $ \ hat {x} $ è il minimo locale, $ f \ left (\ hat {x} \ right) \ leq f \ left (\ lambda \ bar {x} + \ left (1- \ lambda \ right) \ hat {x} \ right), \ forall \ lambda \ in \ left (0, \ delta \ right) $

Poiché f è strettamente quasiconvessa.

$$ f \ left (\ lambda \ bar {x} + \ left (1- \ lambda \ right) \ hat {x} \ right) <max \ left \ {f \ left (\ hat {x} \ right) , f \ sinistra (\ bar {x} \ destra) \ destra \} = f \ sinistra (\ hat {x} \ destra) $$

Quindi, è una contraddizione.

Funzione rigorosamente quasiconcava

Siano $ f: S \ rightarrow \ mathbb {R} ^ n $ e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $, allora f è saud essere strettamente una funzione quasicovex se per ogni $ x_1, x_2 \ in S $ con $ f \ left (x_1 \ right) \ neq f \ left (x_2 \ right) $, abbiamo

$$ f \ sinistra (\ lambda x_1 + \ sinistra (1- \ lambda \ destra) x_2 \ destra)> min \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \} $$.

Esempi

$ f \ sinistra (x \ destra) = x ^ 2-2 $

È una funzione strettamente quasiconvex perché se prendiamo due punti qualsiasi $ x_1, x_2 $ nel dominio che soddisfano i vincoli nella definizione $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) <max \ left \ {f \ left (x_1 \ right), f \ left (x_2 \ right) \ right \} $ Poiché la funzione diminuisce nell'asse x negativo e aumenta nell'asse x positivo ( poiché è una parabola).
$ f \ sinistra (x \ destra) = - x ^ 2 $

Non è una funzione strettamente quasiconvex perché se prendiamo $ x_1 = 1 $ e $ x_2 = -1 $ e $ \ lambda = 0,5 $, allora $ f \ left (x_1 \ right) = - 1 = f \ left ( x_2 \ right) $ ma $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) = 0 $ Pertanto non soddisfa le condizioni indicate nella definizione. Ma è una funzione quasiconcava perché se prendiamo due punti qualsiasi nel dominio che soddisfano i vincoli nella definizione $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right)> min \ left \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \} $. Poiché la funzione aumenta nell'asse x negativo e diminuisce nell'asse x positivo.

Siano $ f: S \ rightarrow \ mathbb {R} ^ n $ e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ allora f è una funzione fortemente quasiconvex se per qualsiasi $ x_1, x_2 \ in S $ con $ \ left (x_1 \ right) \ neq \ left (x_2 \ right) $, abbiamo $ f \ left (\ lambda x_1 + \ left (1- \ lambda \ right) x_2 \ right) <max \: \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \}, \ forall \ lambda \ in \ sinistra (0,1 \ destra) $

Teorema

Una funzione quasiconvessa $ f: S \ rightarrow \ mathbb {R} ^ n $ su un insieme convesso non vuoto S in $ \ mathbb {R} ^ n $ è fortemente funzione quasiconvessa se non è costante su un segmento di linea che unisce qualsiasi punti di S.

Prova

Sia f una funzione quasiconvessa e non è costante su un segmento di retta che unisce punti di S.

Supponiamo che f non sia una funzione fortemente quasiconvessa.

Esistono $ x_1, x_2 \ in S $ con $ x_1 \ neq x_2 $ tali che

$$ f \ sinistra (z \ destra) \ geq max \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (x_2 \ destra) \ destra \}, \ forall z = \ lambda x_1 + \ sinistra (1 - \ lambda \ right) x_2, \ lambda \ in \ left (0,1 \ right) $$

$ \ Freccia destra f \ sinistra (x_1 \ destra) \ leq f \ sinistra (z \ destra) $ e $ f \ sinistra (x_2 \ destra) \ leq f \ sinistra (z \ destra) $

Poiché f non è costante in $ \ left [x_1, z \ right] $ e $ \ left [z, x_2 \ right] $

Quindi esiste $ u \ in \ left [x_1, z \ right] $ e $ v = \ left [z, x_2 \ right] $

$$ \ Freccia destra u = \ mu_1x_1 + \ sinistra (1- \ mu_1 \ destra) z, v = \ mu_2z + \ sinistra (1- \ mu_2 \ destra) x_2 $$

Poiché f è quasiconvesso,

$$ \ Freccia destra f \ sinistra (u \ destra) \ leq max \ sinistra \ {f \ sinistra (x_1 \ destra), f \ sinistra (z \ destra) \ destra \} = f \ sinistra (z \ destra) \ : \: e \: \: f \ sinistra (v \ destra) \ leq max \ sinistra \ {f \ sinistra (z \ destra), f \ sinistra (x_2 \ destra) \ destra \} $$

$$ \ Freccia destra f \ sinistra (u \ destra) \ leq f \ sinistra (z \ destra) \: \: e \: \: f \ sinistra (v \ destra) \ leq f \ sinistra (z \ destra) $ $

$$ \ Freccia destra max \ sinistra \ {f \ sinistra (u \ destra), f \ sinistra (v \ destra) \ destra \} \ leq f \ sinistra (z \ destra) $$

Ma z è qualsiasi punto tra ue v, se qualcuno di essi è uguale, allora f è costante.

Pertanto, $ max \ sinistra \ {f \ sinistra (u \ destra), f \ sinistra (v \ destra) \ destra \} \ leq f \ sinistra (z \ destra) $

che contraddice la quasiconvessità di f come $ z \ in \ sinistra [u, v \ destra] $.

Quindi f è una funzione fortemente quasiconvessa.

Teorema

Siano $ f: S \ rightarrow \ mathbb {R} ^ n $ e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $. Se $ \ hat {x} $ è una soluzione ottima locale, allora $ \ hat {x} $ è una soluzione ottimale globale unica.

Prova

Poiché una forte funzione quasiconvessa è anche strettamente una funzione quasiconvessa, una soluzione ottimale locale è una soluzione ottimale globale.

Uniqueness - Sia f per ottenere la soluzione ottimale globale in due punti $ u, v \ in S $

$$ \ Freccia destra f \ sinistra (u \ destra) \ leq f \ sinistra (x \ destra). \ Forall x \ in S \: \: e \: \: f \ sinistra (v \ destra) \ leq f \ sinistra (x \ destra). \ forall x \ in S $$

Se u è una soluzione ottimale globale, $ f \ sinistra (u \ destra) \ leq f \ sinistra (v \ destra) $ e $ f \ sinistra (v \ destra) \ leq f \ sinistra (u \ destra) \ Freccia destra f \ sinistra (u \ destra) = f \ sinistra (v \ destra) $

$$ f \ sinistra (\ lambda u + \ sinistra (1- \ lambda \ destra) v \ destra) <max \ sinistra \ {f \ sinistra (u \ destra), f \ sinistra (v \ destra) \ destra \} = f \ sinistra (u \ destra) $$

che è una contraddizione.

Quindi esiste solo una soluzione ottimale globale.

Osservazioni

Una funzione fortemente quasiconvessa è anche strettamente quasiconvessa.
Una funzione strettamente convessa può o non può essere fortemente quasiconvessa.
Un differenziabile strettamente convesso è fortemente quasiconvesso.

Sia $ f: S \ rightarrow \ mathbb {R} $ una funzione differenziabile e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $, allora f si dice pseudoconvesso se per ogni $ x_1, x_2 \ in S $ con $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $, abbiamo $ f \ left (x_2 \ right) \ geq f \ left ( x_1 \ right) $, o equivalentemente se $ f \ left (x_1 \ right)> f \ left (x_2 \ right) $ allora $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right ) <0 $

Funzione pseudoconcava

Sia $ f: S \ rightarrow \ mathbb {R} $ una funzione differenziabile e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $, allora f è detto pseudoconvesso se per ogni $ x_1, x_2 \ in S $ con $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $, abbiamo $ f \ left (x_2 \ right) \ leq f \ left ( x_1 \ right) $, o equivalentemente se $ f \ left (x_1 \ right)> f \ left (x_2 \ right) $ allora $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right )> 0 $

Osservazioni

Se una funzione è sia pseudoconvessa che pseudoconcava, allora è chiamata pseudolineare.
Una funzione convessa differenziabili è anche pseudoconvessa.
Una funzione pseudoconvessa potrebbe non essere convessa. Per esempio,

$ f \ sinistra (x \ destra) = x + x ^ 3 $ non è convesso. Se $ x_1 \ leq x_2, x_ {1} ^ {3} \ leq x_ {2} ^ {3} $

Quindi, $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) = \ left (1 + 3x_ {1} ^ {2} \ right) \ left (x_2-x_1 \ right) \ geq 0 $

Inoltre, $ f \ left (x_2 \ right) -f \ left (x_1 \ right) = \ left (x_2-x_1 \ right) + \ left (x_ {2} ^ {3} -x_ {1} ^ {3 } \ right) \ geq 0 $

$ \ Freccia destra f \ sinistra (x_2 \ destra) \ geq f \ sinistra (x_1 \ destra) $

Quindi, è pseudoconvesso.

Una funzione pseudoconvessa è strettamente quasiconvessa. Pertanto, ogni minimo locale di pseudoconvesso è anche minimo globale.

Funzione rigorosamente pseudoconvessa

Sia $ f: S \ rightarrow \ mathbb {R} $ una funzione differenziabile e S un insieme convesso non vuoto in $ \ mathbb {R} ^ n $, allora f si dice pseudoconvesso se per ogni $ x_1, x_2 \ in S $ con $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right) \ geq 0 $, abbiamo $ f \ left (x_2 \ right)> f \ left (x_1 \ right) $, o equivalentemente se $ f \ left (x_1 \ right) \ geq f \ left (x_2 \ right) $ allora $ \ bigtriangledown f \ left (x_1 \ right) ^ T \ left (x_2-x_1 \ right ) <0 $

Teorema

Sia f una funzione pseudoconvessa e supponiamo che $ \ bigtriangledown f \ left (\ hat {x} \ right) = 0 $ per qualche $ \ hat {x} \ in S $, allora $ \ hat {x} $ è ottimale globale soluzione di f su S.

Prova

Sia $ \ hat {x} $ un punto critico di f, cioè $ \ bigtriangledown f \ left (\ hat {x} \ right) = 0 $

Poiché f è una funzione pseudoconvessa, per $ x \ in S, $ abbiamo

$$ \ bigtriangledown f \ left (\ hat {x} \ right) \ left (x- \ hat {x} \ right) = 0 \ Rightarrow f \ left (\ hat {x} \ right) \ leq f \ left (x \ right), \ forall x \ in S $$

Quindi, $ \ hat {x} $ è la soluzione ottimale globale.

Nota

Se f è una funzione strettamente pseudoconvessa, $ \ hat {x} $ è una soluzione ottimale globale unica.

Teorema

Se f è una funzione pseudoconvessa differenziabile su S, allora f è sia strettamente quasiconvessa che quasiconvessa.

Osservazioni

La somma di due funzioni pseudoconvesse definite su un insieme aperto S di $ \ mathbb {R} ^ n $ potrebbe non essere pseudoconvessa.
Sia $ f: S \ rightarrow \ mathbb {R} $ una funzione quasiconvex e S un sottoinsieme convesso non vuoto di $ \ mathbb {R} ^ n $ allora f è pseudoconvesso se e solo se ogni punto critico è un globale minimi di f su S.
Sia S un sottoinsieme convesso non vuoto di $ \ mathbb {R} ^ n $ e $ f: S \ rightarrow \ mathbb {R} $ una funzione tale che $ \ bigtriangledown f \ left (x \ right) \ neq 0 $ per ogni $ x \ in S $ allora f è pseudoconvesso se e solo se è una funzione quasiconvex.

Esistono quattro tipi di problemi di programmazione convessi:

Step 1 - $ min \: f \ left (x \ right) $, dove $ x \ in S $ e S è un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ e $ f \ left (x \ right ) $ è una funzione convessa.

Step 2 - $ min \: f \ sinistra (x \ destra), x \ in \ mathbb {R} ^ n $ soggetto a

$ g_i \ left (x \ right) \ geq 0, 1 \ leq m_1 $ e $ g_i \ left (x \ right) $ è una funzione convessa.

$ g_i \ left (x \ right) \ leq 0, m_1 + 1 \ leq m_2 $ e $ g_i \ left (x \ right) $ è una funzione concava.

$ g_i \ left (x \ right) = 0, m_2 + 1 \ leq m $ e $ g_i \ left (x \ right) $ è una funzione lineare.

dove $ f \ left (x \ right) $ è una funzione convessa.

Step 3 - $ max \: f \ left (x \ right) $ dove $ x \ in S $ e S è un insieme convesso non vuoto in $ \ mathbb {R} ^ n $ e $ f \ left (x \ right) $ è una funzione concava.

Step 4 - $ min \: f \ left (x \ right) $, dove $ x \ in \ mathbb {R} ^ n $ soggetto a

$ g_i \ left (x \ right) \ geq 0, 1 \ leq m_1 $ e $ g_i \ left (x \ right) $ è una funzione convessa.

$ g_i \ left (x \ right) \ leq 0, m_1 + 1 \ leq m_2 $ e $ g_i \ left (x \ right) $ è una funzione concava.

$ g_i \ left (x \ right) = 0, m_2 + 1 \ leq m $ e $ g_i \ left (x \ right) $ è una funzione lineare.

dove $ f \ left (x \ right) $ è una funzione concava.

Cono di direzione fattibile

Sia S un insieme non vuoto in $ \ mathbb {R} ^ n $ e $ \ hat {x} \ in \: Closure \ left (S \ right) $, quindi il cono della direzione ammissibile di S in $ \ hat {x} $, indicato con D, è definito come $ D = \ left \ {d: d \ neq 0, \ hat {x} + \ lambda d \ in S, \ lambda \ in \ left (0, \ delta \ right), \ delta> 0 \ right \} $

Ogni vettore $ d \ in D $ diverso da zero è chiamato direzione ammissibile.

Per una data funzione $ f: \ mathbb {R} ^ n \ Rightarrow \ mathbb {R} $ il cono della direzione di miglioramento in $ \ hat {x} $ è indicato da F ed è dato da

$$ F = \ left \ {d: f \ left (\ hat {x} + \ lambda d \ right) \ leq f \ left (\ hat {x} \ right), \ forall \ lambda \ in \ left ( 0, \ delta \ right), \ delta> 0 \ right \} $$

Ogni direzione $ d \ in F $ è chiamata direzione di miglioramento o direzione di discesa di f in $ \ hat {x} $

Teorema

Necessary Condition

Considera il problema $ min f \ left (x \ right) $ tale che $ x \ in S $ dove S sia un insieme non vuoto in $ \ mathbb {R} ^ n $. Supponiamo che f sia differenziabile in un punto $ \ hat {x} \ in S $. Se $ \ hat {x} $ è una soluzione ottimale locale, allora $ F_0 \ cap D = \ phi $ dove $ F_0 = \ left \ {d: \ bigtriangledown f \ left (\ hat {x} \ right) ^ T d <0 \ right \} $ e D è un cono di direzione ammissibile.

Sufficient Condition

Se $ F_0 \ cap D = \ phi $ f è una funzione pseudoconvessa in $ \ hat {x} $ ed esiste un vicinato di $ \ hat {x}, N_ \ varepsilon \ left (\ hat {x} \ right) , \ varepsilon> 0 $ tale che $ d = x- \ hat {x} \ in D $ per ogni $ x \ in S \ cap N_ \ varepsilon \ left (\ hat {x} \ right) $, quindi $ \ hat {x} $ è la soluzione ottimale locale.

Prova

Necessary Condition

Sia $ F_0 \ cap D \ neq \ phi $, cioè esiste un $ d \ in F_0 \ cap D $ tale che $ d \ in F_0 $ e $ d \ in D $

Poiché $ d \ in D $, quindi esiste $ \ delta_1> 0 $ tale che $ \ hat {x} + \ lambda d \ in S, \ lambda \ in \ left (0, \ delta_1 \ right). $

Da $ d \ in F_0 $, quindi $ \ bigtriangledown f \ left (\ hat {x} \ right) ^ T d <0 $

Quindi, esiste $ \ delta_2> 0 $ tale che $ f \ left (\ hat {x} + \ lambda d \ right) <f \ left (\ hat {x} \ right), \ forall \ lambda \ in f \ sinistra (0, \ delta_2 \ destra) $

Lascia $ \ delta = min \ left \ {\ delta_1, \ delta_2 \ right \} $

Quindi $ \ hat {x} + \ lambda d \ in S, f \ left (\ hat {x} + \ lambda d \ right) <f \ left (\ hat {x} \ right), \ forall \ lambda \ in f \ sinistra (0, \ delta \ destra) $

Ma $ \ hat {x} $ è la soluzione ottimale locale.

Quindi è contraddizione.

Quindi $ F_0 \ cap D = \ phi $

Sufficient Condition

Sia $ F_0 \ cap D \ neq \ phi $ nd sia f una funzione pseudoconvessa.

Lascia che esista un quartiere di $ \ hat {x}, N _ {\ varepsilon} \ left (\ hat {x} \ right) $ tale che $ d = x- \ hat {x}, \ forall x \ in S \ cap N_ \ varepsilon \ left (\ hat {x} \ right) $

Sia $ \ hat {x} $ non una soluzione ottimale locale.

Quindi, esiste $ \ bar {x} \ in S \ cap N_ \ varepsilon \ left (\ hat {x} \ right) $ tale che $ f \ left (\ bar {x} \ right) <f \ left ( \ hat {x} \ right) $

Per ipotesi su $ S \ cap N_ \ varepsilon \ left (\ hat {x} \ right), d = \ left (\ bar {x} - \ hat {x} \ right) \ in D $

Per pseudoconvesso di f,

$$ f \ left (\ hat {x} \ right)> f \ left (\ bar {x} \ right) \ Rightarrow \ bigtriangledown f \ left (\ hat {x} \ right) ^ T \ left (\ bar {x} - \ hat {x} \ right) <0 $$

$ \ Rightarrow \ bigtriangledown f \ left (\ hat {x} \ right) ^ T d <0 $

$ \ Freccia destra d \ in F_0 $

quindi $ F_0 \ cap D \ neq \ phi $

che è una contraddizione.

Quindi, $ \ hat {x} $ è la soluzione ottimale locale.

Considera il seguente problema: $ min \: f \ left (x \ right) $ dove $ x \ in X $ tale che $ g_x \ left (x \ right) \ leq 0, i = 1,2, ..., m $

$ f: X \ rightarrow \ mathbb {R}, g_i: X \ rightarrow \ mathbb {R} ^ n $ e X è un insieme aperto in $ \ mathbb {R} ^ n $

Sia $ S = \ left \ {x: g_i \ left (x \ right) \ leq 0, \ forall i \ right \} $

Lascia $ \ hat {x} \ in X $, quindi $ M = \ left \ {1,2, ..., m \ right \} $

Sia $ I = \ left \ {i: g_i \ left (\ hat {x} \ right) = 0, i \ in M \ right \} $ dove I è chiamato index set per tutti i vincoli attivi in $ \ hat {x } $

Sia $ J = \ left \ {i: g_i \ left (\ hat {x} \ right) <0, i \ in M \ right \} $ dove J è chiamato index set per tutti i vincoli attivi in $ \ hat {x } $.

Lascia $ F_0 = \ left \ {d \ in \ mathbb {R} ^ m: \ bigtriangledown f \ left (\ hat {x} \ right) ^ T d <0 \ right \} $

Sia $ G_0 = \ left \ {d \ in \ mathbb {R} ^ m: \ bigtriangledown gI \ left (\ hat {x} \ right) ^ T d <0 \ right \} $

oppure $ G_0 = \ left \ {d \ in \ mathbb {R} ^ m: \ bigtriangledown gi \ left (\ hat {x} \ right) ^ T d <0, \ forall i \ in I \ right \} $

Lemma

Se $ S = \ left \ {x \ in X: g_i \ left (x \ right) \ leq 0, \ forall i \ in I \ right \} $ e X è un insieme aperto non vuoto in $ \ mathbb {R } ^ n $. Siano $ \ hat {x} \ in S $ e $ g_i $ diversi in $ \ hat {x}, i \ in I $ e $ g_i $ dove $ i \ in J $ sono continui in $ \ hat {x } $, quindi $ G_0 \ subseteq D $.

Prova

Sia $ d \ in G_0 $

Poiché $ \ hat {x} \ in X $ e X è un insieme aperto, quindi esiste $ \ delta_1> 0 $ tale che $ \ hat {x} + \ lambda d \ in X $ per $ \ lambda \ in \ sinistra (0, \ delta_1 \ destra) $

Inoltre, poiché $ g_ \ hat {x} <0 $ e $ g_i $ sono continui in $ \ hat {x}, \ forall i \ in J $, esiste $ \ delta_2> 0 $, $ g_i \ left (\ hat {x} + \ lambda d \ right) <0, \ lambda \ in \ left (0, \ delta_2 \ right) $

Poiché $ d \ in G_0 $, quindi, $ \ bigtriangledown g_i \ left (\ hat {x} \ right) ^ T d <0, \ forall i \ in I $ quindi esiste $ \ delta_3> 0, g_i \ left (\ hat {x} + \ lambda d \ right) <g_i \ left (\ hat {x} \ right) = 0 $, per $ \ lambda \ in \ left (0, \ delta_3 \ right) i \ in J $

Lascia $ \ delta = min \ left \ {\ delta_1, \ delta_2, \ delta_3 \ right \} $

quindi, $ \ hat {x} + \ lambda d \ in X, g_i \ left (\ hat {x} + \ lambda d \ right) <0, i \ in M $

$ \ Rightarrow \ hat {x} + \ lambda d \ in S $

$ \ Freccia destra d \ in D $

$ \ Rightarrow G_0 \ subseteq D $

Quindi dimostrato.

Teorema

Necessary Condition

Siano $ f $ e $ g_i, i \ in I $, sono diversi in $ \ hat {x} \ in S, $ e $ g_j $ sono continui in $ \ hat {x} \ in S $. Se $ \ hat {x} $ è il minimo locale di $ S $, allora $ F_0 \ cap G_0 = \ phi $.

Sufficient Condition

Se $ F_0 \ cap G_0 = \ phi $ ef è una funzione pseudoconvessa in $ \ left (\ hat {x}, g_i 9x \ right), i \ in I $ sono strettamente funzioni pseudoconvesse su alcuni $ \ varepsilon $ - vicinanze di $ \ hat {x}, \ hat {x} $ è una soluzione ottimale locale.

Remarks

Let $\hat{x}$ be a feasible point such that $\bigtriangledown f\left(\hat{x} \right)=0$, then $F_0 = \phi$. Thus, $F_0 \cap G_0= \phi$ but $\hat{x}$ is not an optimal solution
But if $\bigtriangledown g\left(\hat{x} \right)=0$, then $G_0=\phi$, thus $F_0 \cap G_0= \phi$
Consider the problem : min $f\left(x \right)$ such that $g\left(x \right)=0$.

Since $g\left(x \right)=0$, thus $g_1\left(x \right)=g\left(x \right)<0$ and $g_2\left(x \right)=-g\left(x \right) \leq 0$.

Let $\hat{x} \in S$, then $g_1\left(\hat{x} \right)=0$ and $g_2\left(\hat{x} \right)=0$.

But $\bigtriangledown g_1\left(\hat{x} \right)= - \bigtriangledown g_2\left(\hat{x}\right)$

Thus, $G_0= \phi$ and $F_0 \cap G_0= \phi$.

Necessary Conditions

Theorem

Consider the problem − $min f\left ( x \right )$ such that $x \in X$ where X is an open set in $\mathbb{R}^n$ and let $g_i \left ( x \right ) \leq 0, \forall i =1,2,....m$.

Let $f:X \rightarrow \mathbb{R}$ and $g_i:X \rightarrow \mathbb{R}$

Let $\hat{x}$ be a feasible solution and let f and $g_i, i \in I$ are differentiable at $\hat{x}$ and $g_i, i \in J$ are continuous at $\hat{x}$.

If $\hat{x}$ solves the above problem locally, then there exists $u_0, u_i \in \mathbb{R}, i \in I$ such that $u_0 \bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i\in I} u_i \bigtriangledown g_i \left ( \hat{x} \right )$=0

where $u_0,u_i \geq 0,i \in I$ and $\left ( u_0, u_I \right ) \neq \left ( 0,0 \right )$

Furthermore, if $g_i,i \in J$ are also differentiable at $\hat{x}$, then above conditions can be written as −

$u_0 \bigtriangledown f\left ( \hat{x}\right )+\displaystyle\sum\limits_{i=1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$

$u_ig_i\left (\hat{x} \right )$=0

$u_0,u_i \geq 0, \forall i=1,2,....,m$

$\left (u_0,u \right ) \neq \left ( 0,0 \right ), u=\left ( u_1,u_2,s,u_m \right ) \in \mathbb{R}^m$

Remarks

$u_i$ are called Lagrangian multipliers.
The condition that $\hat{x}$ be feasible to the given problem is called primal feasible condition.
The requirement $u_0 \bigtriangledown f\left (\hat{x} \right )+\displaystyle\sum\limits_{i=1}^m u-i \bigtriangledown g_i\left ( x \right )=0$ is called dual feasibility condition.
The condition $u_ig_i\left ( \hat{x} \right )=0, i=1, 2, ...m$ is called complimentary slackness condition. This condition requires $u_i=0, i \in J$
Together the primal feasible condition, dual feasibility condition and complimentary slackness are called Fritz-John Conditions.

Sufficient Conditions

Theorem

If there exists an $\varepsilon$-neighbourhood of $\hat{x}N_\varepsilon \left ( \hat{x} \right ),\varepsilon >0$ such that f is pseudoconvex over $N_\varepsilon \left ( \hat{x} \right )\cap S$ and $g_i,i \in I$ are strictly pseudoconvex over $N_\varepsilon \left ( \hat{x}\right )\cap S$, then $\hat{x}$ is local optimal solution to problem described above. If f is pseudoconvex at $\hat{x}$ and if $g_i, i \in I$ are both strictly pseudoconvex and quasiconvex function at $\hat{x},\hat{x}$ is global optimal solution to the problem described above.

Example

$min \:f\left ( x_1,x_2 \right )=\left ( x_1-3 \right )^2+\left ( x_2-2 \right )^2$

such that $x_{1}^{2}+x_{2}^{2} \leq 5, x_1+2x_2 \leq 4, x_1,x_2 \geq 0$ And $\hat{x}=\left ( 2,1 \right )$

Let $g_1\left (x_1,x_2 \right )=x_{1}^{2}+x_{2}^{2} -5,$

$g_2\left (x_1,x_2 \right )=x_1+2x_2-4,$

$g_3\left (x_1,x_2 \right )=-x_1$ and $g_4\left ( x_1, x_2 \right )= -x_2$.

Thus the above constraints can be written as −

$g_1\left (x_1,x_2 \right )\leq 0,$

$g_2\left (x_1,x_2 \right )\leq 0,$

$g_3\left (x_1,x_2 \right )\leq 0$ and

$g_4\left (x_1,x_2 \right )\leq 0$ Thus, $I = \left \{1,2 \right \}$ therefore, $u_3=0,u_4=0$

$\bigtriangledown f \left (\hat{x} \right )=\left (2,-2 \right ),\bigtriangledown g_1\left (\hat{x} \right )=\left (4,2 \right )$ and $\bigtriangledown g_2\left (\hat{x} \right )=\left (1,2 \right )$

Thus putting these values in the first condition of Fritz-John conditions, we get −

$u_0=\frac{3}{2} u_2, \:\:u_1= \frac{1}{2}u_2,$ and let $u_2=1$, therefore $u_0= \frac{3}{2},\:\:u_1= \frac{1}{2}$

Thus Fritz John conditions are satisfied.
$min f\left (x_1,x_2\right )=-x_1$.

such that $x_2-\left (1-x_1\right )^3 \leq 0$,

$-x_2 \leq 0$ and $\hat{x}=\left (1,0\right )$

Let $g_1\left (x_1,x_2 \right )=x_2-\left (1-x_1\right )^3$,

$g_2\left (x_1,x_2 \right )=-x_2$

Thus the above constraints can be wriiten as −

$g_1\left (x_1,x_2 \right )\leq 0,$

$g_2\left (x_1,x_2 \right )\leq 0,$

Thus, $I=\left \{1,2 \right \}$

$\bigtriangledown f\left (\hat{x} \right )=\left (-1,0\right )$

$\bigtriangledown g_1 \left (\hat{x} \right )=\left (0,1\right )$ and $g_2 \left (\hat{x} \right )=\left (0, -1 \right )$

Thus putting these values in the first condition of Fritz-John conditions, we get −

$u_0=0,\:\: u_1=u_2=a>0$

Thus Fritz John conditions are satisfied.

Consider the problem −

$min \:f\left ( x \right )$ such that $x \in X$, where X is an open set in $\mathbb{R}^n$ and $g_i \left ( x \right )\leq 0, i=1, 2,...,m$

Let $S=\left \{ x \in X:g_i\left ( x \right )\leq 0, \forall i \right \}$

Let $\hat{x} \in S$ and let $f$ and $g_i,i \in I$ are differentiable at $\hat{x}$ and $g_i, i \in J$ are continuous at $\hat{x}$. Furthermore, $\bigtriangledown g_i\left ( \hat{x} \right), i \in I$ are linearly independent. If $\hat{x}$ solves the above problem locally, then there exists $u_i,i \in I$ such that

$\bigtriangledown f\left ( x\right)+\displaystyle\sum\limits_{i\in I} u_i \bigtriangledown g_i\left ( \hat{x} \right)=0$, $\:\:u_i \geq 0, i \in I$

If $g_i,i \in J$ are also differentiable at $\hat{x}$. then $\hat{x}$, then

$\bigtriangledown f\left ( \hat{x}\right)+\displaystyle\sum\limits_{i= 1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right)=0$

$u_ig_i\left ( \hat{x} \right)=0, \forall i=1,2,...,m$

$u_i \geq 0 \forall i=1,2,...,m$

Example

Consider the following problem −

$min \:f\left ( x_1,x_2 \right )=\left ( x_1-3\right )^2+\left ( x_2-2\right )^2$

such that $x_{1}^{2}+x_{2}^{2}\leq 5$,

$x_1,2x_2 \geq 0$ and $\hat{x}=\left ( 2,1 \right )$

Let $g_1\left ( x_1, x_2 \right)=x_{1}^{2}+x_{2}^{2}-5$,

$g_2\left ( x_1, x_2 \right)=x_{1}+2x_2-4$

$g_3\left ( x_1, x_2 \right)=-x_{1}$ and $g_4\left ( x_1,x_2 \right )=-x_2$

Thus the above constraints can be written as −

$g_1 \left ( x_1,x_2 \right)\leq 0, g_2 \left ( x_1,x_2 \right) \leq 0$

$g_3 \left ( x_1,x_2 \right)\leq 0,$ and $g_4 \left ( x_1,x_2 \right) \leq 0$ Thus, $I=\left \{ 1,2 \right \}$ therefore, $ u_3=0,\:\: u_4=0$

$\bigtriangledown f \left ( \hat{x} \right)=\left ( 2,-2 \right), \bigtriangledown g_1 \left ( \hat{x} \right)= \left ( 4,2 \right)$ and

$\bigtriangledown g_2\left ( \hat{x} \right ) =\left ( 1,2 \right )$

Thus putting these values in the first condition of Karush-Kuhn-Tucker conditions, we get −

$u_1=\frac{1}{3}$ and $u_2=\frac{2}{3}$

Thus Karush-Kuhn-Tucker conditions are satisfied.

Method of Steepest Descent

This method is also called Gradient method or Cauchy's method. This method involves the following terminologies −

$$x_{k+1}=x_k+\alpha_kd_k$$

$d_k= - \bigtriangledown f\left ( x_k \right )$ or $ d_k= -\frac{\bigtriangledown f\left ( x_k \right )}{\left \| \bigtriangledown f\left ( x_k \right ) \right \|}$

Let $\phi \left (\alpha \right )=f\left ( x_k +\alpha d_k\right )$

By differentiating $\phi$ and equating it to zero, we can get $\alpha$.

So the algorithm goes as follows −

Initialize $x_0$,$\varepsilon_1$,$\varepsilon_2$ and set $k=0$.
Set $d_k=-\bigtriangledown f\left ( x_k \right ) $or $d_k=-\frac{\bigtriangledown f\left (x_k \right )}{\left \|\bigtriangledown f\left (x_k \right ) \right \|}$.
find $\alpha_k$ such that it minimizes $\phi\left ( \alpha \right )=f\left ( x_k+\alpha d_k \right )$.
Set $x_{k+1}=x_k+\alpha_kd_k$.
If $\left \| x_{k+1-x_k} \right \| <\varepsilon_1$ or $\left \| \bigtriangledown f\left ( x_{k+1} \right ) \right \| \leq \varepsilon_2$, go to step 6, otherwise set $k=k+1$ and go to step 2.
The optimal solution is $\hat{x}=x_{k+1}$.

Newton Method

Newton Method works on the following principle −

$f\left ( x \right )=y\left ( x \right )=f\left ( x_k \right )+\left ( x-x_k \right )^T \bigtriangledown f\left ( x_k \right )+\frac{1}{2}\left ( x-x_k \right )^T H\left ( x_k \right )\left ( x-x_k \right )$

$\bigtriangledown y\left ( x \right )=\bigtriangledown f\left ( x_k \right )+H\left ( x_k \right )\left ( x-x_k \right )$

At $x_{k+1}, \bigtriangledown y\left ( x_{k+1} \right )=\bigtriangledown f\left ( x_k \right )+H\left ( x_k \right )\left ( x_{k+1}-x_k \right )$

For $x_{k+1}$ to be optimal solution $\bigtriangledown y\left ( x_k+1 \right )=0$

Thus, $x_{k+1}=x_k-H\left ( x_k \right )^{-1} \bigtriangledown f\left ( x_k \right )$

Here $H\left ( x_k \right )$ should be non-singular.

Hence the algorithm goes as follows −

Step 1 − Initialize $x_0,\varepsilon$ and set $k=0$.

Step 2 − find $H\left ( x_k \right ) \bigtriangledown f\left ( x_k \right )$.

Step 3 − Solve for the linear system $H\left ( x_k \right )h\left ( x_k \right )=\bigtriangledown f\left ( x_k \right )$ for $h\left ( x_k \right )$.

Step 4 − find $x_{k+1}=x_k-h\left ( x_k \right )$.

Step 5 − If $\left \| x_{k+1}-x_k \right \|< \varepsilon$ or $\left \| \bigtriangledown f\left ( x_k \right ) \right \| \leq \varepsilon$ then go to step 6, else set $k=k+1$ and go to step 2.

Step 6 − The optimal solution is $\hat{x}=x_{k+1}$.

Conjugate Gradient Method

This method is used for solving problems of the following types −

$min f\left ( x \right )=\frac{1}{2}x^T Qx-bx$

where Q is a positive definite nXn matrix and b is constant.

Given $x_0,\varepsilon,$ compute $g_0=Qx_0-b$

Set $d_0=-g_0$ for $k=0,1,2,...,$

Set $\alpha_k=\frac{g_{k}^{T}g_k}{d_{k}^{T}Q d_k}$

Compute $x_{k+1}=x_k+\alpha_kd_k$

Set $g_{k+1}=g_k+\alpha_kd_k$

Compute $\beta_k=\frac{g_{k}^{T}g_k}{d_{k}^{T}Qd_k}$

Compute $x_{k+1}=x_k+\alpha_kd_k$

Set $g_{k+1}=x_k+\alpha_kQd_k$

Compute $\beta_k=\frac{g_{k+1}^{T}g_{k+1}}{g_{k}^{T}gk}$

Set $d_{k+1}=-g_{k+1}+\beta_kd_k$.

Next page ↱