How Closures Are Implemented in Scala

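Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.

Writing this by hand took real effort, so please respect the work. Thank you.

Author: http://blog.csdn.net/wang_wbq

This article gives a brief analysis of how Scala implements closures, working from the class files that the compiler generates from Scala code.

Let's start with a simple example: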

class ClosureDemo {
  def func() = {
    var i = 2
    val inc: () => Unit = () => i = i + 1
    val add: Int => Int = (ii: Int) => ii + i
    (inc, add)
  }
}

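In this code, inc and add both reference the variable i defined in func. Because functions are first-class values in Scala, inc and add form closures that capture the outer variable i.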
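To make the shared capture concrete, here is a minimal sketch that calls the two closures after func has returned. The ClosureDemoApp object is an assumption added for illustration; the explicit return type and the braces around the assignment are cosmetic changes to the article's code.

```scala
class ClosureDemo {
  def func(): (() => Unit, Int => Int) = {
    var i = 2
    // Both function values capture the same local variable i.
    val inc: () => Unit = () => { i = i + 1 }
    val add: Int => Int = (ii: Int) => ii + i
    (inc, add)
  }
}

// Hypothetical driver, not part of the original article.
object ClosureDemoApp {
  def main(args: Array[String]): Unit = {
    val (inc, add) = new ClosureDemo().func()
    println(add(10)) // i is still 2 here
    inc()            // mutates the captured i
    println(add(10)) // add sees the updated i
  }
}
```

Although func has already returned, inc still changes the value that add reads, which is the behavior the rest of the article explains at the bytecode level.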

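Compiling the code above yields three class files. (The output shown in this article matches the encoding used by Scala 2.11 and earlier; Scala 2.12 and later compile most lambdas through invokedynamic and generally do not emit one class file per closure.)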

ClosureDemo.class
ClosureDemo$$anonfun$1.class
ClosureDemo$$anonfun$2.class

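These three files are the ClosureDemo class itself and the two closures. Scala generates one class file per closure. With deep nesting the class names can become very long, which may trigger path-too-long errors on Windows.

In the ClosureCleaner class in the Spark source code we can find the following code, which checks whether a class is a closure: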

// Check whether a class represents a Scala closure
private def isClosure(cls: Class[_]): Boolean = {
  cls.getName.contains("$anonfun$")
}
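As a small sketch of what that check does, the same string test can be applied to the class names listed above. The object ClosureNameCheck and the method looksLikeClosure are hypothetical names used only for this illustration; they are not part of Spark.

```scala
// Hypothetical helper mirroring the name test from the Spark snippet.
object ClosureNameCheck {
  def looksLikeClosure(className: String): Boolean =
    className.contains("$anonfun$")

  def main(args: Array[String]): Unit = {
    // Class file names taken from the example in this article.
    println(looksLikeClosure("ClosureDemo$$anonfun$1")) // true
    println(looksLikeClosure("ClosureDemo"))            // false
  }
}
```

The check relies purely on the compiler's naming convention, so it only recognizes closures that were compiled to classes named this way.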

First, let's use javap to look at the contents of ClosureDemo.class:

{
public scala.Tuple2<scala.Function0<scala.runtime.BoxedUnit>, scala.Function1<java.lang.Object, java.lang.Object>> func();
descriptor: ()Lscala/Tuple2;
flags: ACC_PUBLIC
Code:
stack=4, locals=4, args_size=1
0: iconst_2
1: invokestatic  #16                 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
4: astore_1
5: new           #18                 // class ClosureDemo$$anonfun$1
8: dup
9: aload_0
10: aload_1
11: invokespecial #22                 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
14: astore_2
15: new           #24                 // class ClosureDemo$$anonfun$2
18: dup
19: aload_0
20: aload_1
21: invokespecial #25                 // Method ClosureDemo$$anonfun$2."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
24: astore_3
25: new           #27                 // class scala/Tuple2
28: dup
29: aload_2
30: aload_3
31: invokespecial #30                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
34: areturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      35     0  this   LClosureDemo;
5      29     1     i   Lscala/runtime/IntRef;
15      19     2   inc   Lscala/Function0;
25       9     3   add   Lscala/Function1;
LineNumberTable:
line 3: 0
line 4: 5
line 5: 15
line 6: 25
Signature: #46                          // ()Lscala/Tuple2<Lscala/Function0<Lscala/runtime/BoxedUnit;>;Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;>;
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #41                 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       5     0  this   LClosureDemo;
LineNumberTable:
line 8: 0
}

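Since the class has no field table, we focus on its method table. From the class file above we can see that it has two methods:

1. func(): the func method we defined
2. ClosureDemo(): the class constructor

Let's focus on the implementation of func.
It first pushes the int constant 2 onto the operand stack, then calls the static method create(Int): scala.runtime.IntRef of the scala.runtime.IntRef class to wrap that 2 in an IntRef. Here is the implementation of IntRef: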

package scala.runtime;
public class IntRef implements java.io.Serializable {
    private static final long serialVersionUID = 1488197132022872888L;
    public int elem;
    public IntRef(int elem) { this.elem = elem; }
    public String toString() { return java.lang.Integer.toString(elem); }
    public static IntRef create(int e) { return new IntRef(e); }
    public static IntRef zero() { return new IntRef(0); }
}

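The code is simple: it just wraps the int variable in an IntRef, so the variable moves from the stack to the heap. After that come the constructions of the two closure classes. One point deserves attention: both constructor calls receive this and the IntRef that was just created.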
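The rewrite that the compiler performs can be approximated by hand. The sketch below is an illustration, not the compiler's actual output: it assumes scala.runtime.IntRef is on the classpath (it ships with the Scala library) and uses a hypothetical object name, ManualClosure.

```scala
import scala.runtime.IntRef

// Hand-written approximation of what func() becomes after compilation.
object ManualClosure {
  def func(): (() => Unit, Int => Int) = {
    // "var i = 2" becomes a heap-allocated box holding 2.
    val i = new IntRef(2)
    // Both function values hold a reference to the same box.
    val inc: () => Unit = () => { i.elem = i.elem + 1 }
    val add: Int => Int = (ii: Int) => ii + i.elem
    (inc, add)
  }

  def main(args: Array[String]): Unit = {
    val (inc, add) = func()
    inc()
    inc()
    println(add(1)) // 1 + 4
  }
}
```

Note that the box itself is a val here: only its elem field changes, which is why the generated closure classes can store it in a final field.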

Now let's look inside the closure class. Below are the field table and method table of ClosureDemo$$anonfun$1.class, the bytecode generated for inc:

{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final scala.runtime.IntRef i$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final void apply();
descriptor: ()V
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #23                 // Method apply$mcV$sp:()V
4: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       5     0  this   LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public void apply$mcV$sp();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=3, locals=1, args_size=1
0: aload_0
1: getfield      #27                 // Field i$1:Lscala/runtime/IntRef;
4: aload_0
5: getfield      #27                 // Field i$1:Lscala/runtime/IntRef;
8: getfield      #33                 // Field scala/runtime/IntRef.elem:I
11: iconst_1
12: iadd
13: putfield      #33                 // Field scala/runtime/IntRef.elem:I
16: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      17     0  this   LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public final java.lang.Object apply();
descriptor: ()Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #36                 // Method apply:()V
4: getstatic     #42                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
7: areturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       8     0  this   LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef);
descriptor: (LClosureDemo;Lscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=3, args_size=3
0: aload_0
1: aload_2
2: putfield      #27                 // Field i$1:Lscala/runtime/IntRef;
5: aload_0
6: invokespecial #46                 // Method scala/runtime/AbstractFunction0$mcV$sp."<init>":()V
9: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      10     0  this   LClosureDemo$$anonfun$1;
0      10     1 $outer   LClosureDemo;
0      10     2   i$1   Lscala/runtime/IntRef;
LineNumberTable:
line 4: 0
}

From the output above we can see that it has two fields and four methods:

public static final long serialVersionUID=0L;
private final scala.runtime.IntRef i$1;
public final void apply()
public void apply$mcV$sp()
public final java.lang.Object apply()
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef)

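Let's start with the constructor. When analyzing ClosureDemo.class we saw that constructing the two closures passes in the outer class's this reference and the IntRef; this is the constructor being called. It is simple: it stores the second argument, the IntRef, in the field i$1 (this IntRef is the reference to the object wrapping the number 2) and then calls the constructor of its parent class scala/runtime/AbstractFunction0$mcV$sp. The class name is rather interesting. From my experiments, it follows these rules:
1. In the first part, scala/runtime/AbstractFunction0, the trailing digit is the number of parameters the function takes: 0 means no parameters, and AbstractFunctionX means X parameters. The class extends the corresponding FunctionX type.
2. In the second part, $mcV$sp, the V means the function's return type is void. The letters are JVM type descriptors, with the return type first and the parameter types after it. An example from the Scala sources: boolean apply$mcZIJ$sp(int v1, long v2);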
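The naming rule can be sketched as a small decoder. This assumes the pattern inferred above (return-type letter first, then one letter per parameter, all primitive JVM descriptors); SpecializedName and decode are hypothetical names introduced only for this illustration.

```scala
// Hypothetical decoder for names such as apply$mcZIJ$sp.
object SpecializedName {
  // JVM descriptor letters for void and the primitive types.
  private val primitives: Map[Char, String] = Map(
    'V' -> "void", 'Z' -> "boolean", 'B' -> "byte", 'S' -> "short",
    'C' -> "char", 'I' -> "int", 'J' -> "long", 'F' -> "float",
    'D' -> "double"
  )

  private val Specialized = """.*\$mc([VZBSCIJFD]+)\$sp""".r

  // Returns (return type, parameter types) when the name matches.
  def decode(name: String): Option[(String, List[String])] = name match {
    case Specialized(codes) =>
      Some((primitives(codes.head), codes.tail.toList.map(primitives)))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    println(decode("apply$mcV$sp"))   // Some((void,List()))
    println(decode("apply$mcZIJ$sp")) // Some((boolean,List(int, long)))
  }
}
```

Applied to the names in this article, apply$mcV$sp decodes to a void method with no parameters, which matches the signature of inc.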

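Now for the remaining methods in the class file above. There are two apply methods. The one flagged ACC_BRIDGE, ACC_SYNTHETIC is a compiler-generated bridge method: after type erasure the inherited apply() returns Object, so the bridge calls the void apply() and returns BoxedUnit.UNIT. The other apply simply calls apply$mcV$sp. We therefore focus on the implementation of apply$mcV$sp, which is also very simple:
1. Load the field i$1 onto the stack
2. Read the elem field of the IntRef, that is, the value the IntRef wraps
3. Add 1 to it and write it back into the IntRef

Because the IntRef is an object on the heap, every other holder of a reference to that IntRef sees the number incremented by 1 (ignoring multithreading).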

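The code in ClosureDemo$$anonfun$2.class is much the same as in ClosureDemo$$anonfun$1.class, except that it only returns the sum of the value in the IntRef and the Int argument. Because the same IntRef is passed in when constructing ClosureDemo$$anonfun$1 and ClosureDemo$$anonfun$2, calling inc and add from outside operates on the same number, which looks just like operating on the variable i inside func. This is how inc and add become closures over the outer variable i.

You may have noticed that the constructors of the two closures receive the enclosing class instance, yet in this example it is never used, and it has the peculiar name $outer. Let's modify the example slightly: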

class ClosureDemo {
  def func() = {
    def i = 2
    val j = 3
    var k = 4
    val add: Int => Int = (ii: Int) => ii + i + j + k
    k = k + 1
    add
  }
}

Compiling this produces two files:

ClosureDemo.class
ClosureDemo$$anonfun$1.class

Again, let's first look at ClosureDemo.class:

{
public scala.Function1<java.lang.Object, java.lang.Object> func();
descriptor: ()Lscala/Function1;
flags: ACC_PUBLIC
Code:
stack=5, locals=4, args_size=1
0: iconst_3
1: istore_1
2: iconst_4
3: invokestatic  #16                 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
6: astore_2
7: new           #18                 // class ClosureDemo$$anonfun$1
10: dup
11: aload_0
12: iload_1
13: aload_2
14: invokespecial #22                 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;ILscala/runtime/IntRef;)V
17: astore_3
18: aload_2
19: aload_2
20: getfield      #26                 // Field scala/runtime/IntRef.elem:I
23: iconst_1
24: iadd
25: putfield      #26                 // Field scala/runtime/IntRef.elem:I
28: aload_3
29: areturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      30     0  this   LClosureDemo;
2      27     1     j   I
7      22     2     k   Lscala/runtime/IntRef;
18      11     3   add   Lscala/Function1;
LineNumberTable:
line 4: 0
line 5: 2
line 6: 7
line 7: 18
line 8: 28
Signature: #43                          // ()Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;
public final int ClosureDemo$$i$1();
descriptor: ()I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: iconst_2
1: ireturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       2     0  this   LClosureDemo;
LineNumberTable:
line 3: 0
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #38                 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       5     0  this   LClosureDemo;
LineNumberTable:
line 10: 0
}

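Because we defined the local method i inside func, a method named ClosureDemo$$i$1 was generated. Let's first look at how the two variables val j and var k are handled:
1. Because j is declared with val, it is passed directly as an Int to the constructor of ClosureDemo$$anonfun$1.
2. Because k is declared with var, it is wrapped in an IntRef and passed to the constructor of ClosureDemo$$anonfun$1. Note the later operation that adds 1 to k: it, too, goes through the IntRef wrapper.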
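The observable effect of the two capture strategies can be checked with a short sketch. The class name CaptureDemo and its main method are assumptions added for illustration; the body of func is the article's second example.

```scala
class CaptureDemo {
  def func(): Int => Int = {
    def i = 2 // local method, reached through the enclosing instance
    val j = 3 // immutable, so its value can simply be copied
    var k = 4 // mutable, so it is shared through a box
    val add: Int => Int = (ii: Int) => ii + i + j + k
    k = k + 1 // happens after add was created, but add still sees it
    add
  }
}

// Hypothetical driver, not part of the original article.
object CaptureDemo {
  def main(args: Array[String]): Unit = {
    val add = new CaptureDemo().func()
    println(add(1)) // 1 + 2 + 3 + 5
  }
}
```

The result includes the incremented k even though the assignment ran after the closure was built, which is only possible because the closure and func share the same box.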

Next, let's look at the ClosureDemo$$anonfun$1.class file:

{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final ClosureDemo $outer;
descriptor: LClosureDemo;
flags: ACC_PRIVATE, ACC_FINAL, ACC_SYNTHETIC
private final int j$1;
descriptor: I
flags: ACC_PRIVATE, ACC_FINAL
private final scala.runtime.IntRef k$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final int apply(int);
descriptor: (I)I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: iload_1
2: invokevirtual #27                 // Method apply$mcII$sp:(I)I
5: ireturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0       6     0  this   LClosureDemo$$anonfun$1;
0       6     1    ii   I
LineNumberTable:
line 6: 0
public int apply$mcII$sp(int);
descriptor: (I)I
flags: ACC_PUBLIC
Code:
stack=2, locals=2, args_size=2
0: iload_1
1: aload_0
2: getfield      #32                 // Field $outer:LClosureDemo;
5: invokevirtual #36                 // Method ClosureDemo.ClosureDemo$$i$1:()I
8: iadd
9: aload_0
10: getfield      #38                 // Field j$1:I
13: iadd
14: aload_0
15: getfield      #40                 // Field k$1:Lscala/runtime/IntRef;
18: getfield      #45                 // Field scala/runtime/IntRef.elem:I
21: iadd
22: ireturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      23     0  this   LClosureDemo$$anonfun$1;
0      23     1    ii   I
LineNumberTable:
line 6: 0
public final java.lang.Object apply(java.lang.Object);
descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: aload_1
2: invokestatic  #52                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
5: invokevirtual #54                 // Method apply:(I)I
8: invokestatic  #58                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
11: areturn
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      12     0  this   LClosureDemo$$anonfun$1;
0      12     1    v1   Ljava/lang/Object;
LineNumberTable:
line 6: 0
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef);
descriptor: (LClosureDemo;ILscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=4, args_size=4
0: aload_1
1: ifnonnull     6
4: aconst_null
5: athrow
6: aload_0
7: aload_1
8: putfield      #32                 // Field $outer:LClosureDemo;
11: aload_0
12: iload_2
13: putfield      #38                 // Field j$1:I
16: aload_0
17: aload_3
18: putfield      #40                 // Field k$1:Lscala/runtime/IntRef;
21: aload_0
22: invokespecial #65                 // Method scala/runtime/AbstractFunction1$mcII$sp."<init>":()V
25: return
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      26     0  this   LClosureDemo$$anonfun$1;
0      26     1 $outer   LClosureDemo;
0      26     2   j$1   I
0      26     3   k$1   Lscala/runtime/IntRef;
LineNumberTable:
line 6: 0
StackMapTable: number_of_entries = 1
frame_type = 6 /* same */
}

From the output above we can see that it has four fields and four methods:

public static final long serialVersionUID=0L;
private final ClosureDemo $outer
private final int j$1;
private final scala.runtime.IntRef k$1
public final int apply(int)
public int apply$mcII$sp(int)
public final java.lang.Object apply(java.lang.Object)
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef)

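Again we start with the constructor. It first checks whether the first argument is null; if it is, a NullPointerException is thrown (the aconst_null, athrow sequence), otherwise the argument is stored in the $outer field. It then stores j: Int and k: IntRef in the fields j$1 and k$1.

Because the apply method simply calls apply$mcII$sp(int), we go on to analyze apply$mcII$sp(int). It first calls the ClosureDemo$$i$1 method of the ClosureDemo class to get the value of i, then reads the Int field j$1, then reads the elem value of the IntRef field k$1, and returns the sum of all of them together with the argument.

From this example we can conclude:
1. When a closure calls an outer method, the enclosing class instance is stored in the closure's $outer field, and the method is invoked through $outer with invokevirtual.
2. When a closure uses an outer val, the value is simply stored in a correspondingly named field and read directly when needed.
3. When a closure uses an outer var, a value-type (AnyVal) variable is wrapped in the corresponding Ref object (IntRef for Int) and stored in a field, while a reference-type (AnyRef) variable is wrapped in an ObjectRef. The original value is read through the elem field.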
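The three rules above can be summarized as a hand-written class that mimics the generated closure. This is a sketch under assumptions, not the compiler's output: Outer and HandWrittenAdd are hypothetical names, and scala.runtime.IntRef is assumed to be available from the Scala library.

```scala
import scala.runtime.IntRef

// Stands in for the enclosing ClosureDemo instance.
class Outer {
  def i: Int = 2
}

// Mimics the shape of ClosureDemo$$anonfun$1 from the second example.
final class HandWrittenAdd(outer: Outer, j: Int, k: IntRef)
    extends (Int => Int) {
  // Rule 1: the enclosing instance must be present, as in the bytecode.
  if (outer == null) throw new NullPointerException

  // Rule 1: call through outer; rule 2: read j directly;
  // rule 3: read the var through the box's elem field.
  def apply(ii: Int): Int = ii + outer.i + j + k.elem
}

object HandWrittenAdd {
  def main(args: Array[String]): Unit = {
    val k = new IntRef(4)
    val add = new HandWrittenAdd(new Outer, 3, k)
    k.elem = k.elem + 1 // same as "k = k + 1" in func
    println(add(1))     // 1 + 2 + 3 + 5
  }
}
```

Keeping j as a plain Int and k as a box mirrors the j$1 and k$1 fields in the javap output.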

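This post only covers closures with a single level of nesting. Scala also allows closures nested many levels deep. The only difference from the single-level case is that each closure, when needed, stores its nearest enclosing object in its $outer field. If you are interested, construct an example yourself and look at its class files.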
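A minimal starting point for such an experiment might look like the sketch below. NestedClosureDemo and counter are hypothetical names; the sketch only demonstrates the runtime behavior, and the exact class files it produces depend on the compiler version.

```scala
object NestedClosureDemo {
  // The outer function value returns a new inner function value on
  // each call; every inner closure captures the same variable n.
  def counter(): () => () => Int = {
    var n = 0
    () => { () => { n = n + 1; n } }
  }

  def main(args: Array[String]): Unit = {
    val make = counter()
    val a = make()
    val b = make()
    println(a()) // first increment
    println(b()) // second increment, same n
    println(a()) // third increment
  }
}
```

Compiling this and running javap on the output is one way to see how the captured variable is threaded through each level.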
