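Introduction to Akka
I once used scrala, a crawler framework for Scala that is built on Akka; see
A First Look at scrala (Part 1)
Akka is, after all, a powerful tool for distributed and parallel problems,
so it is worth having a general understanding of it.
We begin by introducing some basic concepts.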
Concurrency vs Parallelism
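Concurrency means that tasks, which need not start at the same time, are in progress over the same period; parallelism means that tasks execute at the same instant.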
Asynchronous vs Synchronous
Akka is asynchronous by nature.
The following passages are quoted from Akka Essentials to give a broad overview of Akka.
The driving force of Akka's Actor Model
The existing Java-based concurrency model does not lend itself well to
the underlying hardware multiprocessor model. This leads to the Java
application not being able to scale out to handle the demands of a distributed,
scalable, concurrent application.
The Akka framework has taken the "Actor Model" concept to build an
event-driven middleware framework that allows the building of concurrent,
scalable, and distributed systems. Akka uses the Actor Model to raise the
abstraction level, decoupling the business logic from the low-level
constructs of threads, locks, and non-blocking IO.
The Akka framework provides the following features:
Concurrency: The Akka Actor Model abstracts concurrency handling and
allows the programmer to focus on the business logic.
Scalability: The Akka Actor Model's asynchronous message passing allows
applications to scale up on multicore servers.
Fault tolerance: Akka borrows concepts and techniques from Erlang to
build the "Let It Crash" fault-tolerance model.
Transaction support: Akka implements transactors that combine actors
and software transactional memory (STM) into transactional actors.
Location transparency: Akka provides a unified programming model for multicore
and distributed computing needs.
Scala/Java APIs: Akka supports both Java and Scala APIs for building applications.
The Akka framework is envisioned as a toolkit and runtime for building highly
concurrent, distributed, fault-tolerant, event-driven applications on the JVM.
Concurrent systems 
When writing large concurrent systems, the traditional model of shared-state concurrency
makes use of changing shared memory locations. The system uses
multithreaded programming coupled with synchronization monitors to guard against potential
deadlocks. The multithreaded programming model is based on how to manage and control
concurrent access to the shared, mutable state.
Manipulating shared, mutable state via threads makes it hard at times to debug problems.
Usage of locks may guarantee correct behavior, but it is likely to lead to
threads running into a deadlock, with each acquiring locks in a different order
and waiting for each other. (Deadlock: two threads each wait for the other to release a resource it needs, so both wait forever.)
Working with threads requires a much higher level of programming skill, and it is
very difficult to predict the behavior of threads in a runtime environment.
Java provides shared-memory threads with locks as its primary form of concurrency
abstraction. However, shared-memory threads are quite heavyweight and incur severe performance
penalties from context-switching overheads. (Java's native approach to concurrency has both complexity and performance problems.)
Container-based applications 
Java Platform, Enterprise Edition (JEE) was introduced as a platform to develop and run distributed
multitier Java applications. The entire multitier architecture is based on the concept of breaking
down the application into specialized layers that process smaller pieces of logic. These multitier
applications are deployed to containers (called application servers) provided by vendors
such as IBM or Oracle, which host and provide the infrastructure to run the application.
The application server is tuned to run the application and utilize the underlying hardware.
The container-based model allows the application to be distributed across nodes
and allows it to be scaled. The runtime model of the application servers has its
own share of issues, as follows:
In case of runtime failures, the entire request call fails. It is very difficult
to retry any method execution or recover from failures.
The application's scalability is tied to the underlying application container
settings. An application cannot make use of different threading models to account
for different workloads within the same application.
Using the container-based model to scale out applications requires a large
set of resources, and the overheads of managing the application across the application
server nodes are very high.
The JEE programming model for writing distributed applications is not the best fit for
a scale-out application model.
Actor Model 
The Actor Model takes a different approach to solving the problem of concurrency, by avoiding
the issues caused by threads and locks. In the Actor Model, all objects are modeled as independent
computational entities that only respond to the messages received. There is no shared state
between actors.
Actors change their state only when they receive a stimulus in the form of a message. So, unlike the
object-oriented world where objects are executed sequentially, actors execute concurrently.
The Actor Model is based on the following principles:
Immutable messages are used to communicate between actors. Actors do not share state,
and if any information is shared, it is done via messages only. Actors control the access
to the state and nobody else can access the state. This means there is no shared, mutable
state.
Each actor has a queue attached where the incoming messages are enqueued. Messages are picked
from the queue and processed by the actor, one at a time. An actor can respond to a received
message by sending immutable messages to other actors, creating a new set of actors, updating
its own state, or designating the computational logic to be used when the next message
arrives (behavior change).
Messages are passed between actors asynchronously. This means that the sender does not wait
for the message to be received and can go back to its execution immediately. Any actor can send
a message to another actor with no guarantee on the sequence of message arrival and execution.
Communication between the sender and receiver is decoupled and asynchronous, allowing them
to execute in different threads. Having invocation and execution in separate threads,
coupled with no shared state, allows actors to provide a concurrent and scalable model.
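The principles above can be tried out without Akka at all. The following is only a minimal sketch built on java.util.concurrent: TinyActor and its methods are hypothetical names chosen for illustration, and this is not how Akka implements actors. It shows private state touched by a single worker thread, a queue acting as the mailbox, one-at-a-time processing, and a tell that returns immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for an actor: private state, a mailbox, one worker thread.
class TinyActor {
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final List<String> seen = new ArrayList<>(); // state, touched only by the worker
    private final CountDownLatch done = new CountDownLatch(1);

    TinyActor() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = mailbox.take(); // dequeue one message at a time
                    if (message.equals("stop")) {
                        break;
                    }
                    seen.add(message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Asynchronous send: enqueue and return without waiting for processing.
    void tell(String message) {
        mailbox.add(message);
    }

    // Blocks until "stop" has been processed, then returns a copy of the state.
    List<String> awaitProcessed() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        TinyActor actor = new TinyActor();
        actor.tell("first");
        actor.tell("second");
        actor.tell("stop");
        System.out.println(actor.awaitProcessed()); // prints [first, second]
    }
}
```

Because only the worker thread ever touches seen, no lock is needed around the state; the queue is the only point of contact between threads.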
Actor systems
An actor is an independent, concurrent computational entity that responds to messages.
Before we jump into actors, we need to understand the role played by an actor in the overall scheme
of things. An actor is the smallest unit in the grand scheme of things.
Concurrent programs are split into separate entities that work on distinct subtasks. Each actor performs
its quota of tasks (subtasks), and when all the actors have finished their individual subtasks,
the bigger task gets completed.
The actor system is the container that manages actor behavior, lifecycle, hierarchy, and configuration,
among other things. The actor system provides the structure to manage the application.
What is an actor? 
An actor is modeled as an object that encapsulates state and behavior. All the messages
intended for the actor are parked in a queue, and the actor processes the messages from that queue.
Actors can change their state and behavior based on the message passed. This allows them to respond
to changes in the incoming messages. An actor has the constituents that are listed in
the following sections.
State
Actor objects hold instance variables that have certain state values, or they can be pure computational
entities (stateless). The state values held by the actor's instance variables define the state of the actor.
The state can be characterized by counters, listeners, or references to resources or a state machine. The
actor state is changed only in response to a message. The whole premise of the actor is to prevent
the actor state from getting corrupted or locked via concurrent access to the state variables.
Behavior
Behavior is nothing but the computation logic that needs to be executed in response to the message received.
The actor behavior itself can undergo a change as a reaction to a message. This means the actor can swap
the existing behavior with a new behavior when a certain message comes in. The actor defaults to the
original behavior in the case of a restart after encountering a failure. (On failure, the default is a restart.)
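A behavior swap can be pictured as replacing the function that handles the next message. The sketch below is plain Java with hypothetical names (SwitchingActor, awake, asleep, restart); it only illustrates the idea and is not Akka's API. In Akka itself, the post's later notes mention getContext().become() for this purpose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: an "actor" whose message handler can be replaced at runtime.
class SwitchingActor {
    private final List<String> log = new ArrayList<>();
    private Consumer<String> behavior = this::awake; // the original behavior

    private void awake(String message) {
        if (message.equals("sleep")) {
            behavior = this::asleep; // swap behavior for the next message
        } else {
            log.add("handled " + message);
        }
    }

    private void asleep(String message) {
        if (message.equals("wake")) {
            behavior = this::awake;
        } else {
            log.add("ignored " + message);
        }
    }

    // Messages are handled by whatever behavior is current.
    void receive(String message) {
        behavior.accept(message);
    }

    // A restart falls back to the original behavior.
    void restart() {
        behavior = this::awake;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        SwitchingActor actor = new SwitchingActor();
        actor.receive("a");
        actor.receive("sleep");
        actor.receive("b");
        actor.restart();
        actor.receive("c");
        System.out.println(actor.log()); // prints [handled a, ignored b, handled c]
    }
}
```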
Mailbox 
An actor responds to messages. The connection between the sender sending a message and the
receiver actor receiving the message is called the mailbox. Every actor is attached to exactly one
mailbox. When a message is sent to the actor, the message gets enqueued in its mailbox, from where the
message is dequeued for processing by the receiving actor. The order of arrival of the messages in the
queue is determined at runtime based on the time order of the send operations. Messages from one
sender actor to a given receiver actor will be enqueued in the same order as they are sent.
Akka provides multiple mailbox implementations. The mailboxes can be bounded (the blocking variant) or unbounded. A bounded
mailbox limits the number of messages that can be queued in the mailbox, meaning it has a defined,
fixed capacity for holding messages.
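What "fixed capacity" means in practice can be seen with a plain bounded queue. This is a small sketch using java.util.concurrent.ArrayBlockingQueue as an assumed stand-in for a bounded mailbox; BoundedMailboxSketch and countRejected are hypothetical names, and Akka's own bounded mailboxes are configured differently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a fixed-capacity queue standing in for a bounded mailbox.
class BoundedMailboxSketch {
    // Tries to enqueue every message into a mailbox of the given capacity and
    // returns how many were rejected because the mailbox was already full.
    static int countRejected(int capacity, String... messages) {
        BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (String message : messages) {
            // offer() returns false instead of blocking when the queue is full.
            if (!mailbox.offer(message)) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(countRejected(2, "a", "b", "c")); // prints 1
    }
}
```

Using put() instead of offer() would make the sender block until space is available, which is the other common policy for a full bounded queue.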
At times, applications may want to prioritize certain messages over others. To handle such cases,
Akka provides a priority mailbox where the messages are enqueued based on the assigned priority. Akka
does not allow scanning of the mailbox. Messages are processed in the same order
as they are enqueued in the mailbox.
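The effect of a priority mailbox can be approximated with java.util.concurrent.PriorityBlockingQueue. The sketch below uses hypothetical names (PriorityMailboxSketch, Message, drainInPriorityOrder) and assumes that a lower number means higher priority; it is not Akka's priority mailbox API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch: a priority queue standing in for a priority mailbox.
class PriorityMailboxSketch {
    // Assumed message shape: a lower priority value is handled first.
    static final class Message {
        final int priority;
        final String body;

        Message(int priority, String body) {
            this.priority = priority;
            this.body = body;
        }
    }

    // Enqueues all messages, then dequeues them in priority order.
    static List<String> drainInPriorityOrder(List<Message> incoming) {
        PriorityBlockingQueue<Message> mailbox = new PriorityBlockingQueue<>(
                11, Comparator.comparingInt((Message m) -> m.priority));
        mailbox.addAll(incoming);
        List<String> handled = new ArrayList<>();
        Message next;
        while ((next = mailbox.poll()) != null) {
            handled.add(next.body);
        }
        return handled;
    }

    public static void main(String[] args) {
        List<Message> incoming = Arrays.asList(
                new Message(2, "log"),
                new Message(0, "shutdown"),
                new Message(1, "work"));
        System.out.println(drainInPriorityOrder(incoming)); // prints [shutdown, work, log]
    }
}
```

Messages with equal priority come out of this queue in no guaranteed order, so the example uses distinct priorities.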
Akka makes use of dispatchers to pass the messages from the queue to the actors for processing. Akka
supports different types of dispatchers.
Actor lifecycle 
Every actor that is defined and created has an associated lifecycle. Akka provides hooks such as preStart
that allow the actor's state and behavior to be initialized. When the actor is stopped, Akka disables
message queuing for the actor before postStop is invoked. In the postStop hook, any persistence of the state
or cleanup of any held resources can be done.
Further, Akka supports two types of actors: untyped actors and typed actors.
Fault tolerance
Akka follows the premise of the actor hierarchy, where we have specialized actors that are adept at handling
or performing an activity. To manage these specialized actors, we have supervisor actors that coordinate and
manage their lifecycle. As the complexity of the problem grows, the hierarchy also expands to manage the complexity.
This allows the system to be as simple or as complex as required, based on the tasks that need to be performed.
The whole idea is to break down the task into smaller tasks to the point where the task is granular and
structured enough to be performed by one actor. Each actor knows which kind of message it will process
and how it reacts in the event of failure. So, if the actor does not know how to handle a particular message
or an abnormal runtime behavior, the actor asks its supervisor for help. The recursive actor
hierarchy allows the problem to be propagated upwards to the point where it can be handled. Remember, every
actor in Akka has one and only one supervisor.
This actor hierarchy forms the basis of the Akka "Let It Crash" fault-tolerance model. Akka's fault-tolerance
model is built using the actor hierarchy and supervisor model.
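In Erlang, "let it crash" means that programmers need not worry excessively about unknown errors and write exhaustive defensive code. Instead, when such an error occurs,
the context in which it arose (usually a process) is simply allowed to crash and exit. When the process exits, it reports this state to its monitor process,
which decides how to handle the error further.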
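The division of labor described above, where the worker contains no defensive code and the supervisor decides what to do after a crash, can be sketched in plain Java. Everything here (SupervisorSketch, Worker, supervise) is a hypothetical illustration under the assumption of a "restart on any failure" policy; Akka's real supervision is configured through supervisor strategies and is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of "let it crash": the worker does not guard against bad
// input; the supervisor replaces it with a fresh instance when it fails.
class SupervisorSketch {
    // Worker with state: keeps a running total and crashes on non-numeric input.
    static final class Worker {
        private int total = 0;

        int handle(String message) {
            total += Integer.parseInt(message); // throws NumberFormatException on bad input
            return total;
        }
    }

    // Assumed policy: on any runtime failure, discard the worker and start a new one.
    static List<String> supervise(Supplier<Worker> factory, List<String> messages) {
        List<String> events = new ArrayList<>();
        Worker worker = factory.get();
        for (String message : messages) {
            try {
                events.add("total=" + worker.handle(message));
            } catch (RuntimeException failure) {
                worker = factory.get(); // restart: the worker's state is reset
                events.add("restarted");
            }
        }
        return events;
    }

    public static void main(String[] args) {
        List<String> events = supervise(Worker::new, Arrays.asList("1", "2", "oops", "5"));
        System.out.println(events); // prints [total=1, total=3, restarted, total=5]
    }
}
```

Note how the total starts from zero again after the restart: the failed worker's state is thrown away rather than repaired.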
Location transparency
For a distributed application, all actor interactions need to be asynchronous and location transparent.
That is, the location of the actor (local or remote) has no impact on the application. Whether we are accessing
an actor, or invoking or passing a message, everything remains the same.
To achieve this location transparency, the actors need to be identifiable and reachable. Under the hood,
Akka uses configuration to indicate whether the actor is running locally or on a remote machine. Akka uses
the actor hierarchy and combines it with the actor system addresses to make each actor identifiable and reachable.
Akka uses the same philosophy as the WWW to identify and locate resources on the Web. The WWW makes use of the uniform
resource locator (URL) to identify and locate resources on the Web. A URL has the form scheme://domain:port/path,
where scheme defines the protocol (HTTP or FTP), domain defines the server name or the IP address, port defines the port
where the process listens for incoming requests, and path specifies the resource to be fetched.
Akka uses a similar URL convention to locate actors. In the case of an Akka application, the default values are
akka://hostname/ or akka://hostname:2552/, depending upon whether the application uses remote actors or not.
To identify a resource within the application, the actor hierarchy is used to identify the location of the actor.
The actor hierarchy allows a unique path to be created to reach any actor within the actor system. This unique
path coupled with the address creates a unique address that identifies and locates an actor.
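Because actor paths follow the URL convention, the standard java.net.URI class can take one apart. The sketch below is only an illustration: the path string is an assumed example that reuses the system and actor names from the code later in this post, and the exact address format depends on the Akka version and remoting configuration. ActorPathSketch and describe are hypothetical names.

```java
import java.net.URI;

// Hypothetical sketch: splitting an actor-path-like URL into its components.
class ActorPathSketch {
    static String describe(String actorPath) {
        URI uri = URI.create(actorPath);
        return "scheme=" + uri.getScheme()
                + " system=" + uri.getUserInfo() // the part before '@' in the authority
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Assumed example address; not taken from a running actor system.
        System.out.println(describe("akka://myActorSystem@127.0.0.1:2552/user/helloworld/myChild"));
        // prints scheme=akka system=myActorSystem host=127.0.0.1 port=2552 path=/user/helloworld/myChild
    }
}
```

The path component mirrors the actor hierarchy: each segment names one level, so a child's path extends its parent's path.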
Within the application, each actor is accessed using an ActorRef, which is based on the underlying actor
path. ActorRef allows us to access actors transparently without knowing their locations. That is, the location
of the actor is transparent to the application. Location transparency allows you to build applications without
worrying about how the actors communicate underneath.
(Akka treats remote and local actors the same: all can be accessed by an address URL.)
Transactors
To provide transaction capabilities to actors, Akka transactors combine actors with STM to form transactional
actors. This allows actors to compose atomic message flows with automatic retry and rollback.
Working with threads and locks is hard, and there is no guarantee that the application will not run into locking issues.
To abstract away the threading and locking hardships, STM, which is a concurrency mechanism for managing access to
shared memory in a concurrent environment, has gained a lot of acceptance.
STM is modeled along the lines of database transaction handling. In the case of STM, the Java heap is the transactional
data set with begin/commit and rollback constructs. As the objects hold the state in memory, the transaction
only implements the characteristics of atomicity, consistency, and isolation.
For actors to implement a shared-state model and provide a consistent, stable view of the state across the calling
components, Akka transactors provide the way. Akka transactors combine the Actor Model and STM to provide the best
of both worlds, allowing you to write transactional, asynchronous, event-based message flow applications and giving
you composed, atomic, arbitrarily deep message flows.
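The begin/commit/rollback-with-retry cycle can be illustrated with an optimistic update loop on an AtomicReference. To be clear, this is an analogy and not Akka's STM or transactor API: OptimisticUpdateSketch, Balances, and transfer are hypothetical names, and a real STM coordinates many references rather than one.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: optimistic "transaction" over one immutable snapshot.
class OptimisticUpdateSketch {
    // Immutable snapshot of two balances (assumed example state).
    static final class Balances {
        final int from;
        final int to;

        Balances(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    private final AtomicReference<Balances> state;

    OptimisticUpdateSketch(int from, int to) {
        state = new AtomicReference<>(new Balances(from, to));
    }

    // Moves amount atomically: either both balances change or neither does.
    boolean transfer(int amount) {
        while (true) {
            Balances before = state.get(); // begin: read a consistent snapshot
            if (before.from < amount) {
                return false; // abort: nothing has been modified
            }
            Balances after = new Balances(before.from - amount, before.to + amount);
            if (state.compareAndSet(before, after)) {
                return true; // commit
            }
            // Another thread committed first: retry with a fresh snapshot.
        }
    }

    Balances snapshot() {
        return state.get();
    }

    public static void main(String[] args) {
        OptimisticUpdateSketch accounts = new OptimisticUpdateSketch(100, 0);
        System.out.println(accounts.transfer(30));  // prints true
        System.out.println(accounts.transfer(100)); // prints false
        Balances now = accounts.snapshot();
        System.out.println(now.from + " " + now.to); // prints 70 30
    }
}
```

Readers always see either the old snapshot or the new one, never a half-applied transfer, which is the atomicity and isolation the passage describes.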
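(The transaction handling described here exists to keep state consistent when work is rolled back and recovered; for similar concepts, see SQL transactions or Redis pipelines.)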
Some concepts of Akka 
An actor is a computation unit with state, behavior, and its own mailbox
There are two types of actors: untyped and typed
Communication between actors can be asynchronous or synchronous
Message passing to the actors happens using dispatchers
Actors are organized in a hierarchy via the actor system
Actors are proxied via ActorRef
Supervisor actors are used to build the fault-tolerance mechanism
Actor path follows the URL scheme, which enables location transparency
STM is used to provide transactional support to multiple actor state updates
Akka use cases
Any business use case that requires the application to scale up and scale out, be fault
tolerant, provide high availability, or handle massive concurrency/parallelism is a prime target
for the Akka Actor Model.
Transaction processing:
This includes processing large data streams, where the incoming data is either time series or
transactional data. The stream pumps in large amounts of data that need to be processed
in parallel and concurrently. The output of the data processing might be used in real time or
might be fed into an analytical system.
Service providers:
Another area is where the application provides services to various other clients via a variety of
service means such as SOAP, REST, Cometd, or WebSockets. The application generally caters to a massive
number of stateless requests that need to be processed fast and concurrently.
Batch processing:
Batch processing, used across enterprise domains, is another area where Akka shines. Dealing with
large data by applying paradigms such as divide and conquer, map-reduce, master-worker, and grid computing
allows massive data to be processed. The data might be coming in via real-time feeds, or it might be
unstructured data (coming via logfiles) or data read from existing data stores.
Data mining/analytics/Business Intelligence
Most enterprises generate large amounts of data, structured as well as unstructured. Applications that
mine this data from existing transactional stores or data warehouses can use Akka to process and analyze
this massive data.
Service gateways/hubs
Apps requiring concurrency/parallelism
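Some simple first examples follow.
Classes used
UntypedActor
This is the base type that must be extended to implement Actor Model semantics.
It is the Java counterpart of the Scala Actor trait.
It is an abstract base class; a subclass is an MDB-style untyped actor.
An actor of this kind has a well-defined life-cycle:
RUNNING (actor created and started): can receive messages
SHUTDOWN (when 'stop' or 'exit' is invoked)
getSelf() returns the actor's own ActorRef (as one might expect, this class extends Scala's common base class AnyRef).
With that reference, messages can be sent using methods such as forward and tell.
getSender() returns the sender of the current message.
getContext() returns the UntypedActorContext (unlike the native ActorContext, it defines the corresponding Java API).
The only abstract method is onReceive(), which is called for every processed message
unless the behavior has been replaced dynamically with getContext().become().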
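Actor Model (this part is excerpted from Wikipedia)
A mathematical model of concurrent computation that treats actors as the universal primitives of concurrent computation.
In response to a message it receives,
an actor can make local decisions,
create more actors,
send more messages,
and determine how to respond to the next message received.
An actor may modify its own private state, but actors can affect each other only through messages (which avoids the need for locks).
The actor model adopts the philosophy that everything is an actor, similar to the philosophy that everything is an object.
An actor is a computational entity that, in response to a message it receives, can concurrently:
send a finite number of messages to other actors
create a finite number of new actors
designate the behavior to be used for the next message it receives
There is no required order among these actions; they can be carried out in parallel.
The decoupling this model provides is a major strength (it is achieved through asynchronous computation and message-based communication).
Message recipients are identified uniquely by address (sometimes called a "mailing address").
Thus an actor can communicate only with actors whose addresses it has.
It can obtain these addresses from the messages it receives, or it knows the address of an actor because it created that actor.
The actor model is characterized by the following:
inherent concurrency of computation among actors,
dynamic creation of actors,
inclusion of addresses in messages,
interaction only through direct asynchronous message passing (with no restriction on message arrival order).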
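Props is the configuration class used to create actors.
It can be regarded as immutable, so it can safely carry the associated descriptive information (for example, which dispatcher to use).
A basic role of Props is to produce Actor instances.
The example below creates Actor instances and performs simple communication, using most of what was covered above.
Reposted from
https://my.oschina.net/xinxingegeya/blog/366447

The Actor class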

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.actor.UntypedActor;
public class MyActor extends UntypedActor {
    private int x;
    private int y;
    public MyActor(int x, int y) {
        this.x = x;
        this.y = y;
    }
    @Override
    public void onReceive(Object message) throws Exception {
        System.out.println("message=" + message);
        // Reply to the sender with the sum, then stop this actor.
        int result = x + y;
        this.getSender().tell(result, this.getSelf());
        this.getContext().stop(this.getSelf());
    }
}

A simple example in which one actor uses Props to create the actor above and then shuts itself down

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;
public class HelloWorld {
    public static class StartActor extends UntypedActor {
        @Override
        public void preStart() throws Exception {
            // Create the child with constructor arguments 4 and 5, then message it.
            final ActorRef child = getContext().actorOf(Props.create(MyActor.class, 4, 5), "myChild");
            child.tell("good morning", this.getSelf());
        }
        @Override
        public void onReceive(Object message) throws Exception {
            // The child's reply arrives here; print it and stop.
            System.out.println("result=" + message);
            this.getContext().stop(this.getSelf());
        }
    }
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("myActorSystem");
        system.actorOf(Props.create(StartActor.class), "helloworld");
    }
}

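The Props usage above passes the constructor arguments for the Actor class directly.
Below is the approach that goes through a Creator in between (surprisingly, the class created this way does not need to provide a default constructor).
You only need to define the create method in the Creator and return the required Actor instance.
Creator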

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.japi.Creator;
public class MyActorCreater implements Creator<MyActor> {
    private int x;
    private int y;
    public MyActorCreater(int x, int y) {
        this.x = x;
        this.y = y;
    }
    @Override
    public MyActor create() throws Exception {
        return new MyActor(x, y);
    }
}

The actor that calls Props

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;
public class HelloWorldCreater {
    public static class StartActor extends UntypedActor {
        @Override
        public void preStart() throws Exception {
            // Create the child through the Creator instead of passing constructor arguments.
            final ActorRef child = getContext().actorOf(Props.create(new MyActorCreater(4, 5)), "myChild");
            child.tell("good morning", this.getSelf());
        }
        @Override
        public void onReceive(Object message) throws Exception {
            System.out.println("result=" + message);
            this.getContext().stop(this.getSelf());
        }
    }
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("myActorSystem");
        system.actorOf(Props.create(StartActor.class), "helloworld");
    }
}

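Below is the so-called recommended approach (as mentioned in the original documentation):
define the Creator from the implementation above as an anonymous inner class, here inside a static props factory method on the actor.
Actor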

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.actor.Props;
import akka.actor.UntypedActor;
import akka.japi.Creator;
public class MyActor2 extends UntypedActor {
    private int x;
    private int y;
    MyActor2(int x, int y) {
        this.x = x;
        this.y = y;
    }
    // Static factory that keeps the Props for this actor next to the actor itself.
    public static Props props(final int x, final int y) throws Exception {
        return Props.create(new Creator<MyActor2>() {
            private static final long serialVersionUID = 1L;
            @Override
            public MyActor2 create() throws Exception {
                return new MyActor2(x, y);
            }
        });
    }
    @Override
    public void onReceive(Object message) {
        System.out.println("message=" + message);
        // Reply to the sender with the sum, then stop this actor.
        int result = x + y;
        this.getSender().tell(result, this.getSelf());
        this.getContext().stop(this.getSelf());
    }
}

Main class

package akkaTest2;
/**
 * Created by admin on 2017/3/9.
 */
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;
public class HelloWordl2 {
    public static class StartActor extends UntypedActor {
        @Override
        public void preStart() throws Exception {
            // Obtain the child's Props from its static factory method.
            final ActorRef child = getContext().actorOf(MyActor2.props(4, 7), "myChild");
            child.tell("good morning", this.getSelf());
        }
        @Override
        public void onReceive(Object message) throws Exception {
            System.out.println("result=" + message);
            this.getContext().stop(this.getSelf());
        }
    }
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("myActorSystem");
        system.actorOf(Props.create(StartActor.class), "helloworld");
    }
}

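The code above can be used without much explanation of the basic methods;
tell, for example, is quite intuitive. There are some details, though. Consider the sender passed to tell
(the second argument in the tell calls above). When this argument is omitted
and the receiving actor calls getSender() to look up the sender, it shows akka://default/deadLetters,
that is, the message appears to come from deadLetters.
By default, deadLetters handles messages that could not be delivered to an actor.
So one common way of using tell is, for messages that cannot be recognized (whose type cannot be determined with instanceof),
to use the form of tell that omits the sender.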

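AkkaCrawler Translation (Part 1) shows an application of Akka in a web crawler.

For an application in machine learning, see A Simple Application of Akka in a Bagging Voting Algorithm.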
