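Warning: this article contains a large amount of AOSP source code. Dizziness, nausea, and drowsiness while reading are normal; the author is not immune either and therefore accepts no legal liability.
All source code quoted in this article comes from Android API 29 Platform, that is, Android 10.0.
Reading this article requires some C/C++ background.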
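Before an application can start, the process that hosts it must already be running.
The previous two articles covered the startup of the key system processes and the Activity launch flow, and established two facts:
App processes are created by Zygote forking itself.
When an Activity starts, the system first checks whether the app's process already exists and is running.
In other words, at least three processes take part in the whole flow: SystemServer, Zygote, and the target app process.
The rest of this article has three parts: AMS sends the request to create the app process, Zygote receives the request and creates the app process, and the app process initializes itself.
AMS sends the request to create the app process
In the previous article, ActivityStackSupervisor checked whether the app process was running while executing startSpecificActivityLocked. We pick up from there.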
ActivityStackSupervisor#startSpecificActivityLocked
frameworks/base/services/core/java/com/android/server/wm/ActivityStackSupervisor.java
    void startSpecificActivityLocked(ActivityRecord r, boolean andResume, boolean checkConfig) {
        final WindowProcessController wpc =
                mService.getProcessController(r.processName, r.info.applicationInfo.uid);
        try {
            ···
            final Message msg = PooledLambda.obtainMessage(
                    ActivityManagerInternal::startProcess, mService.mAmInternal, r.processName,
                    r.info.applicationInfo, knownToBeDead, "activity", r.intent.getComponent());
            mService.mH.sendMessage(msg);
        } finally {
            Trace.traceEnd(TRACE_TAG_ACTIVITY_MANAGER);
        }
    }

Here mService is the ActivityTaskManagerService object. Its getProcessController method returns a WindowProcessController, which is null when the target process has not been started.
If wpc is null, PooledLambda#obtainMessage is used to build a Message, and the handler mService.mH posts that message to its own thread.
PooledLambda uses pooling to build anonymous functions that can be recycled and reused. Its acquire function tries to take a reusable instance from the object pool.
Readers who know Kotlin will recognize the double-colon syntax for function references; Java gained this feature in JDK 1.8. The expression ActivityManagerInternal::startProcess evaluates to a Consumer-style functional object, here a HexConsumer.
The source of PooledLambda#obtainMessage is as follows:

    static <A, B, C, D, E, F> Message obtainMessage(
            HexConsumer<? super A, ? super B, ? super C, ? super D, ? super E, ? super F> function,
            A arg1, B arg2, C arg3, D arg4, E arg5, F arg6) {
        synchronized (Message.sPoolSync) {
            PooledRunnable callback = acquire(PooledLambdaImpl.sMessageCallbacksPool,
                    function, 6, 0, ReturnType.VOID, arg1, arg2, arg3, arg4, arg5, arg6, null, null,
                    null);
            return Message.obtain().setCallback(callback.recycleOnUse());
        }
    }

This code simply wraps the ActivityManagerInternal#startProcess method into a callback and sets it on the Message.
Because the Message's callback is not null, the Handler runs that callback directly when it dispatches the message. In this case PooledLambdaImpl#run executes, which in turn invokes the HexConsumer, and so the method reference we passed in gets called.
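The dispatch rule just described can be sketched in plain Java. This is a minimal illustration, not the framework's code: HexConsumerSketch, MessageSketch, and Starter are hypothetical stand-ins, and pooling is left out entirely. It only shows how a method reference plus six bound arguments becomes a Runnable that wins over normal message handling.

```java
/** Hypothetical six-argument consumer, modeled on the HexConsumer seen above. */
interface HexConsumerSketch<A, B, C, D, E, F> {
    void accept(A a, B b, C c, D d, E e, F f);
}

/** Hypothetical stand-in for a message: only the callback field matters here. */
final class MessageSketch {
    Runnable callback;
    int what;
}

final class CallbackDispatchSketch {
    static final StringBuilder LOG = new StringBuilder();

    /** Binds six arguments to a function and stores the result as the callback. */
    static <A, B, C, D, E, F> MessageSketch obtainMessage(
            HexConsumerSketch<? super A, ? super B, ? super C, ? super D, ? super E, ? super F> fn,
            A a, B b, C c, D d, E e, F f) {
        MessageSketch msg = new MessageSketch();
        msg.callback = () -> fn.accept(a, b, c, d, e, f);
        return msg;
    }

    /** A non-null callback runs first; otherwise the message is handled normally. */
    static void dispatch(MessageSketch msg) {
        if (msg.callback != null) {
            msg.callback.run();
        } else {
            LOG.append("handleMessage(").append(msg.what).append(")");
        }
    }

    /** Hypothetical target; the receiver is the first bound argument. */
    static final class Starter {
        void startProcess(String name, String info, Boolean knownToBeDead,
                String hostingType, String component) {
            LOG.append("start ").append(name).append(" for ").append(hostingType);
        }
    }

    public static void main(String[] args) {
        MessageSketch msg = obtainMessage(Starter::startProcess, new Starter(),
                "com.example.app", "appInfo", false, "activity", "MainActivity");
        dispatch(msg);
        System.out.println(LOG);
    }
}
```

Note how the receiver (new Starter()) travels as the first argument, which is why a five-parameter method fits a six-argument consumer.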
Next we look at the ActivityManagerInternal#startProcess method.
ActivityManagerInternal#startProcess
ActivityManagerInternal is an abstract class and startProcess is an abstract method; the actual implementation lives in its subclass ActivityManagerService$LocalService.
LocalService is an inner class of ActivityManagerService. Here is its startProcess method:
frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

    @Override
    public void startProcess(String processName, ApplicationInfo info,
            boolean knownToBeDead, String hostingType, ComponentName hostingName) {
        try {
            synchronized (ActivityManagerService.this) {
                startProcessLocked(processName, info, knownToBeDead, 0 /* intentFlags */,
                        new HostingRecord(hostingType, hostingName),
                        false /* allowWhileBooting */, false /* isolated */,
                        true /* keepIfLarge */);
            }
        } finally {
            ...
        }
    }

The method mainly takes the lock for the work that follows and wraps hostingType and hostingName in a HostingRecord object.
HostingRecord describes what triggered the process start. hostingType can be activity, service, broadcast, or content provider; here it is "activity". hostingName is the ComponentName of the corresponding component.
It then calls startProcessLocked in ActivityManagerService.
ActivityManagerService#startProcessLocked
frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
    final ProcessRecord startProcessLocked(String processName,
            ApplicationInfo info, boolean knownToBeDead, int intentFlags,
            HostingRecord hostingRecord, boolean allowWhileBooting,
            boolean isolated, boolean keepIfLarge) {
        return mProcessList.startProcessLocked(processName, info, knownToBeDead, intentFlags,
                hostingRecord, allowWhileBooting, isolated, 0 /* isolatedUid */, keepIfLarge,
                null /* abiOverride */, null /* entryPoint */, null /* entryPointArgs */,
                null /* crashHandler */);
    }

The task of starting the process is forwarded to mProcessList, which is a ProcessList object:

    final ProcessList mProcessList = new ProcessList();

ProcessList is the class ActivityManagerService uses to manage processes. It is important for understanding Android's process priority (ADJ) algorithm, and understanding ADJ can help extend how long an app stays alive (a somewhat rogue trick). That topic will get its own article later.
Let's keep following the app process start.
ProcessList#startProcessLocked
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
    @GuardedBy("mService")
    final ProcessRecord startProcessLocked(String processName, ApplicationInfo info,
            boolean knownToBeDead, int intentFlags, HostingRecord hostingRecord,
            boolean allowWhileBooting, boolean isolated, int isolatedUid, boolean keepIfLarge,
            String abiOverride, String entryPoint, String[] entryPointArgs, Runnable crashHandler) {
        long startTime = SystemClock.elapsedRealtime();
        ProcessRecord app;
        if (!isolated) {
            // Non-isolated processes may reuse an existing ProcessRecord.
            app = getProcessRecordLocked(processName, info.uid, keepIfLarge);
            if ((intentFlags & Intent.FLAG_FROM_BACKGROUND) != 0) {
                if (mService.mAppErrors.isBadProcessLocked(info)) {
                    return null;
                }
            } else {
                mService.mAppErrors.resetProcessCrashTimeLocked(info);
                if (mService.mAppErrors.isBadProcessLocked(info)) {
                    EventLog.writeEvent(EventLogTags.AM_PROC_GOOD,
                            UserHandle.getUserId(info.uid), info.uid,
                            info.processName);
                    mService.mAppErrors.clearBadProcessLocked(info);
                    if (app != null) {
                        app.bad = false;
                    }
                }
            }
        } else {
            app = null;
        }

        if (app != null && app.pid > 0) {
            if ((!knownToBeDead && !app.killed) || app.thread == null) {
                // The process is alive or still starting: reuse it.
                app.addPackage(info.packageName, info.longVersionCode, mService.mProcessStats);
                return app;
            }
            // Otherwise clean up the stale process before starting a new one.
            ProcessList.killProcessGroup(app.uid, app.pid);
            mService.handleAppDiedLocked(app, true, true);
        }

        if (app == null) {
            app = newProcessRecordLocked(info, processName, isolated, isolatedUid, hostingRecord);
            if (app == null) {
                return null;
            }
            app.crashHandler = crashHandler;
            app.isolatedEntryPoint = entryPoint;
            app.isolatedEntryPointArgs = entryPointArgs;
        } else {
            app.addPackage(info.packageName, info.longVersionCode, mService.mProcessStats);
        }

        if (!mService.mProcessesReady
                && !mService.isAllowedWhileBooting(info)
                && !allowWhileBooting) {
            if (!mService.mProcessesOnHold.contains(app)) {
                mService.mProcessesOnHold.add(app);
            }
            return app;
        }

        final boolean success = startProcessLocked(app, hostingRecord, abiOverride);
        return success ? app : null;
    }

This method mainly prepares the ProcessRecord for the target process. ProcessRecord is the class AMS uses to hold information about a process, much like ActivityRecord in the previous article.
It then calls another startProcessLocked overload in the same class.
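The reuse check in the middle of that method is easy to misread, so here is a minimal sketch of it as a pure function. Record and Decision are hypothetical names; the attached flag stands in for app.thread != null, and everything else the real method does (bad-process tracking, on-hold queue) is left out.

```java
final class ProcessReuseSketch {
    enum Decision { REUSE_EXISTING, KILL_AND_RESTART, START_NEW }

    /** Hypothetical, trimmed-down view of the ProcessRecord fields used by the check. */
    static final class Record {
        final int pid;
        final boolean killed;
        final boolean attached; // stands in for app.thread != null

        Record(int pid, boolean killed, boolean attached) {
            this.pid = pid;
            this.killed = killed;
            this.attached = attached;
        }
    }

    /** Mirrors the branch structure of the excerpt above. */
    static Decision decide(Record app, boolean knownToBeDead) {
        if (app != null && app.pid > 0) {
            if ((!knownToBeDead && !app.killed) || !app.attached) {
                // Alive, or already starting but not yet attached: keep waiting for it.
                return Decision.REUSE_EXISTING;
            }
            return Decision.KILL_AND_RESTART;
        }
        return Decision.START_NEW;
    }

    public static void main(String[] args) {
        System.out.println(decide(null, false));
        System.out.println(decide(new Record(100, false, true), false));
        System.out.println(decide(new Record(100, false, true), true));
    }
}
```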
ProcessList#startProcessLocked
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
    final boolean startProcessLocked(ProcessRecord app, HostingRecord hostingRecord,
            String abiOverride) {
        return startProcessLocked(app, hostingRecord,
                false /* disableHiddenApiChecks */, false /* mountExtStorageFull */, abiOverride);
    }

This overload only supplies default arguments. Moving on.

    boolean startProcessLocked(ProcessRecord app, HostingRecord hostingRecord,
            boolean disableHiddenApiChecks, boolean mountExtStorageFull,
            String abiOverride) {
        ···
        mService.mProcessesOnHold.remove(app);
        ···
        try {
            try {
                final int userId = UserHandle.getUserId(app.uid);
                AppGlobals.getPackageManager().checkPackageStartable(app.info.packageName, userId);
            } catch (RemoteException e) {
                throw e.rethrowAsRuntimeException();
            }
            ···
            final String entryPoint = "android.app.ActivityThread";
            return startProcessLocked(hostingRecord, entryPoint, app, uid, gids,
                    runtimeFlags, mountExternal, seInfo, requiredAbi, instructionSet, invokeWith,
                    startTime);
        } catch (RuntimeException e) {
            mService.forceStopPackageLocked(app.info.packageName, UserHandle.getAppId(app.uid),
                    false, false, true, false, false, app.userId, "start failure");
            return false;
        }
    }

The original method is long and has been heavily trimmed here. Its main job is to choose the external storage mount mode for the app process, set the process's GIDs, compute the runtime flags, and so on.
One line is key:

    final String entryPoint = "android.app.ActivityThread";

This names the class that holds the new process's entry function. Moving on.
    @GuardedBy("mService")
    boolean startProcessLocked(HostingRecord hostingRecord,
            String entryPoint,
            ProcessRecord app, int uid, int[] gids, int runtimeFlags, int mountExternal,
            String seInfo, String requiredAbi, String instructionSet, String invokeWith,
            long startTime) {
        app.pendingStart = true;
        app.killedByAm = false;
        app.removed = false;
        app.killed = false;
        if (app.startSeq != 0) {
            Slog.wtf(TAG, "startProcessLocked processName:" + app.processName
                    + " with non-zero startSeq:" + app.startSeq);
        }
        if (app.pid != 0) {
            Slog.wtf(TAG, "startProcessLocked processName:" + app.processName
                    + " with non-zero pid:" + app.pid);
        }
        final long startSeq = app.startSeq = ++mProcStartSeqCounter;
        app.setStartParams(uid, hostingRecord, seInfo, startTime);
        app.setUsingWrapper(invokeWith != null
                || SystemProperties.get("wrap." + app.processName) != null);
        mPendingStarts.put(startSeq, app);

        if (mService.mConstants.FLAG_PROCESS_START_ASYNC) {
            // Asynchronous start: do the work on the process-start handler thread.
            mService.mProcStartHandler.post(() -> {
                try {
                    final Process.ProcessStartResult startResult = startProcess(app.hostingRecord,
                            entryPoint, app, app.startUid, gids, runtimeFlags, mountExternal,
                            app.seInfo, requiredAbi, instructionSet, invokeWith, app.startTime);
                    synchronized (mService) {
                        handleProcessStartedLocked(app, startResult, startSeq);
                    }
                } catch (RuntimeException e) {
                    synchronized (mService) {
                        mPendingStarts.remove(startSeq);
                        app.pendingStart = false;
                        mService.forceStopPackageLocked(app.info.packageName,
                                UserHandle.getAppId(app.uid),
                                false, false, true, false, false, app.userId, "start failure");
                    }
                }
            });
            return true;
        } else {
            // Synchronous start on the calling thread.
            try {
                final Process.ProcessStartResult startResult = startProcess(hostingRecord,
                        entryPoint, app,
                        uid, gids, runtimeFlags, mountExternal, seInfo, requiredAbi, instructionSet,
                        invokeWith, startTime);
                handleProcessStartedLocked(app, startResult.pid, startResult.usingWrapper,
                        startSeq, false);
            } catch (RuntimeException e) {
                app.pendingStart = false;
                mService.forceStopPackageLocked(app.info.packageName,
                        UserHandle.getAppId(app.uid),
                        false, false, true, false, false, app.userId, "start failure");
            }
            return app.pid > 0;
        }
    }

This method handles the bookkeeping for starting the process either synchronously or asynchronously; both paths end up calling startProcess.
ProcessList#startProcess
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
    private Process.ProcessStartResult startProcess(HostingRecord hostingRecord, String entryPoint,
            ProcessRecord app, int uid, int[] gids, int runtimeFlags, int mountExternal,
            String seInfo, String requiredAbi, String instructionSet, String invokeWith,
            long startTime) {
        try {
            final Process.ProcessStartResult startResult;
            if (hostingRecord.usesWebviewZygote()) {
                startResult = startWebView(entryPoint,
                        app.processName, uid, uid, gids, runtimeFlags, mountExternal,
                        app.info.targetSdkVersion, seInfo, requiredAbi, instructionSet,
                        app.info.dataDir, null, app.info.packageName,
                        new String[] {PROC_START_SEQ_IDENT + app.startSeq});
            } else if (hostingRecord.usesAppZygote()) {
                final AppZygote appZygote = createAppZygoteForProcessIfNeeded(app);
                startResult = appZygote.getProcess().start(entryPoint,
                        app.processName, uid, uid, gids, runtimeFlags, mountExternal,
                        app.info.targetSdkVersion, seInfo, requiredAbi, instructionSet,
                        app.info.dataDir, null, app.info.packageName,
                        false,
                        new String[] {PROC_START_SEQ_IDENT + app.startSeq});
            } else {
                // Regular Zygote: the common case.
                startResult = Process.start(entryPoint,
                        app.processName, uid, uid, gids, runtimeFlags, mountExternal,
                        app.info.targetSdkVersion, seInfo, requiredAbi, instructionSet,
                        app.info.dataDir, invokeWith, app.info.packageName,
                        new String[] {PROC_START_SEQ_IDENT + app.startSeq});
            }
            return startResult;
        } finally {
            Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
        }
    }

This method decides which Zygote process should create our app process.
In the common case, where no Zygote is specified, HostingRecord uses the regular Zygote:

    public HostingRecord(String hostingType, ComponentName hostingName) {
        this(hostingType, hostingName, REGULAR_ZYGOTE);
    }

Process#start forwards the work to ZygoteProcess, and ZygoteProcess#start in turn calls startViaZygote, so we go straight to startViaZygote.
ZygoteProcess#startViaZygote
frameworks/base/core/java/android/os/ZygoteProcess.java
    private Process.ProcessStartResult startViaZygote(@NonNull final String processClass,
                                                      @Nullable final String niceName,
                                                      final int uid, final int gid,
                                                      @Nullable final int[] gids,
                                                      int runtimeFlags, int mountExternal,
                                                      int targetSdkVersion,
                                                      @Nullable String seInfo,
                                                      @NonNull String abi,
                                                      @Nullable String instructionSet,
                                                      @Nullable String appDataDir,
                                                      @Nullable String invokeWith,
                                                      boolean startChildZygote,
                                                      @Nullable String packageName,
                                                      boolean useUsapPool,
                                                      @Nullable String[] extraArgs)
                                                      throws ZygoteStartFailedEx {
        ArrayList<String> argsForZygote = new ArrayList<>();
        argsForZygote.add("--runtime-args");
        argsForZygote.add("--setuid=" + uid);
        argsForZygote.add("--setgid=" + gid);
        argsForZygote.add("--runtime-flags=" + runtimeFlags);
        ···
        synchronized (mLock) {
            return zygoteSendArgsAndGetResult(openZygoteSocketIfNeeded(abi),
                                              useUsapPool,
                                              argsForZygote);
        }
    }

The main job here is to build the argsForZygote list, which carries the new process's uid, gid, groups, target SDK version, niceName, and a series of other parameters.
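To make the shape of that list concrete, here is a small sketch that builds one. The first four flags are the ones visible in the excerpt; --target-sdk-version, --setgroups, and --nice-name are flags I believe ZygoteProcess also emits in the elided part, so treat their exact spelling and order as assumptions. buildArgs itself is a hypothetical helper, not a framework method.

```java
import java.util.ArrayList;
import java.util.List;

final class ZygoteArgsSketch {
    /** Builds an argument list in the style of argsForZygote; ends with the entry class. */
    static List<String> buildArgs(int uid, int gid, int[] gids, int runtimeFlags,
            int targetSdkVersion, String niceName, String processClass) {
        List<String> args = new ArrayList<>();
        args.add("--runtime-args");
        args.add("--setuid=" + uid);
        args.add("--setgid=" + gid);
        args.add("--runtime-flags=" + runtimeFlags);
        // Assumed flags for the elided part of the excerpt.
        args.add("--target-sdk-version=" + targetSdkVersion);
        if (gids != null && gids.length > 0) {
            StringBuilder groups = new StringBuilder("--setgroups=");
            for (int i = 0; i < gids.length; i++) {
                if (i > 0) {
                    groups.append(',');
                }
                groups.append(gids[i]);
            }
            args.add(groups.toString());
        }
        if (niceName != null) {
            args.add("--nice-name=" + niceName);
        }
        // The class whose main() the child will run.
        args.add(processClass);
        return args;
    }

    public static void main(String[] args) {
        System.out.println(buildArgs(10123, 10123, new int[] {3003, 1015}, 0, 29,
                "com.example.app", "android.app.ActivityThread"));
    }
}
```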
ZygoteProcess#zygoteSendArgsAndGetResult
frameworks/base/core/java/android/os/ZygoteProcess.java
    private Process.ProcessStartResult zygoteSendArgsAndGetResult(
            ZygoteState zygoteState, boolean useUsapPool, @NonNull ArrayList<String> args)
            throws ZygoteStartFailedEx {
        ···
        if (useUsapPool && mUsapPoolEnabled && canAttemptUsap(args)) {
            try {
                return attemptUsapSendArgsAndGetResult(zygoteState, msgStr);
            } catch (IOException ex) {
                // Fall through to the regular Zygote path below.
                Log.e(LOG_TAG, "IO Exception while communicating with USAP pool - "
                        + ex.getMessage());
            }
        }
        return attemptZygoteSendArgsAndGetResult(zygoteState, msgStr);
    }

A new term appears here: the USAP pool. It is a mechanism introduced in Android 10 that keeps a pool of pre-forked processes inside Zygote. Android 10 does not turn it on, however: mUsapPoolEnabled is false by default.
ZygoteProcess#attemptZygoteSendArgsAndGetResult
frameworks/base/core/java/android/os/ZygoteProcess.java
    private Process.ProcessStartResult attemptZygoteSendArgsAndGetResult(
            ZygoteState zygoteState, String msgStr) throws ZygoteStartFailedEx {
        try {
            final BufferedWriter zygoteWriter = zygoteState.mZygoteOutputWriter;
            final DataInputStream zygoteInputStream = zygoteState.mZygoteInputStream;

            zygoteWriter.write(msgStr);
            zygoteWriter.flush();

            Process.ProcessStartResult result = new Process.ProcessStartResult();
            // Blocks until Zygote replies with the new pid.
            result.pid = zygoteInputStream.readInt();
            result.usingWrapper = zygoteInputStream.readBoolean();
            if (result.pid < 0) {
                throw new ZygoteStartFailedEx("fork() failed");
            }
            return result;
        } catch (IOException ex) {
            zygoteState.close();
            Log.e(LOG_TAG, "IO Exception while communicating with Zygote - "
                    + ex.toString());
            throw new ZygoteStartFailedEx(ex);
        }
    }

This method sends the argument list to the Zygote process over a socket and then blocks until the socket server on the other end sends back the pid of the newly created process.
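The write-then-block exchange can be tried out without a real socket. The sketch below runs it over in-memory streams. The framing in encode (argument count, then one argument per line) is my assumption about how msgStr is built in the elided code, and fakeReply is a hypothetical helper that plays Zygote's side of the conversation.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ZygoteWireSketch {
    static final class StartResult {
        int pid;
        boolean usingWrapper;
    }

    /** Assumed framing: argument count, then one argument per line. */
    static String encode(List<String> args) {
        return args.size() + "\n" + String.join("\n", args) + "\n";
    }

    /** Writes the request, then blocks reading an int pid and a boolean flag. */
    static StartResult sendAndWait(String msg, OutputStream toZygote, InputStream fromZygote) {
        try {
            BufferedWriter writer = new BufferedWriter(
                    new OutputStreamWriter(toZygote, StandardCharsets.UTF_8));
            writer.write(msg);
            writer.flush();

            DataInputStream in = new DataInputStream(fromZygote);
            StartResult result = new StartResult();
            result.pid = in.readInt();
            result.usingWrapper = in.readBoolean();
            if (result.pid < 0) {
                throw new IllegalStateException("fork() failed");
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical peer side: serializes a reply the way the reader expects it. */
    static byte[] fakeReply(int pid, boolean usingWrapper) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(pid);
            out.writeBoolean(usingWrapper);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a real socket, readInt() is the point where the calling thread in SystemServer would sit waiting for the fork to finish.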
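At this point SystemServer's preparatory work is done.
Summary
Let's recap what SystemServer did in this section.
When SystemServer starts an Activity, it checks whether the Activity's process exists and starts the process if it does not.
The process-start code begins in ActivityManagerService$LocalService and is then forwarded to ProcessList.
ProcessList checks whether the process is isolated, tries to fetch an existing ProcessRecord, and tries to reuse the process.
If the process cannot be reused, it goes on to configure the start parameters, including the entry class of the target process:

    final String entryPoint = "android.app.ActivityThread";

Finally the arguments are sent to Zygote over a socket, and the caller blocks waiting for the reply.
The sequence diagram is shown below.
Zygote receives the request and creates the app process
From the earlier article on system process startup we know that, after creating SystemServer, Zygote enters an infinite loop waiting for requests from clients such as SystemServer. That flow was analyzed there, so here we only skim it.
ZygoteServer#runSelectLoop
frameworks/base/core/java/com/android/internal/os/ZygoteServer.java

    Runnable runSelectLoop(String abiList) {
        ···
        while (--pollIndex >= 0) {
            if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
                continue;
            }
            if (pollIndex == 0) {
                // Zygote server socket: accept a new connection.
                ZygoteConnection newPeer = acceptCommandPeer(abiList);
                peers.add(newPeer);
                socketFDs.add(newPeer.getFileDescriptor());
            } else if (pollIndex < usapPoolEventFDIndex) {
                // Existing connection: read and process one command.
                try {
                    ZygoteConnection connection = peers.get(pollIndex);
                    final Runnable command = connection.processOneCommand(this);
                    ···
    }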
ZygoteConnection#processOneCommand
frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java
    Runnable processOneCommand(ZygoteServer zygoteServer) {
        ...
        pid = Zygote.forkAndSpecialize(parsedArgs.mUid, parsedArgs.mGid, parsedArgs.mGids,
                parsedArgs.mRuntimeFlags, rlimits, parsedArgs.mMountExternal, parsedArgs.mSeInfo,
                parsedArgs.mNiceName, fdsToClose, fdsToIgnore, parsedArgs.mStartChildZygote,
                parsedArgs.mInstructionSet, parsedArgs.mAppDataDir, parsedArgs.mTargetSdkVersion);
        ...
    }

Now let's see how the fork itself is carried out.
Zygote#forkAndSpecialize

    public static int forkAndSpecialize(int uid, int gid, int[] gids, int runtimeFlags,
            int[][] rlimits, int mountExternal, String seInfo, String niceName, int[] fdsToClose,
            int[] fdsToIgnore, boolean startChildZygote, String instructionSet, String appDataDir,
            int targetSdkVersion) {
        ZygoteHooks.preFork();
        resetNicePriority();
        int pid = nativeForkAndSpecialize(
                uid, gid, gids, runtimeFlags, rlimits, mountExternal, seInfo, niceName, fdsToClose,
                fdsToIgnore, startChildZygote, instructionSet, appDataDir);
        ···
        ZygoteHooks.postForkCommon();
        return pid;
    }

This method splits the fork into three steps: preFork, nativeForkAndSpecialize, and postForkCommon. nativeForkAndSpecialize is the step that actually performs the fork.
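The three-step shape (quiesce helper threads, run the critical step, restart the helpers) can be sketched with ordinary Java threads. This is only an analogy: Java cannot fork, so the nativeFork argument is a hypothetical stand-in that just returns a number, and ForkPhasesSketch is not a framework class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.LockSupport;
import java.util.function.IntSupplier;

final class ForkPhasesSketch {
    private final List<Thread> daemons = new ArrayList<>();
    private volatile boolean running;

    /** Starts background helper threads that idle until asked to stop. */
    void startDaemons(int count) {
        running = true;
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                while (running) {
                    LockSupport.parkNanos(1_000_000L);
                }
            }, "daemon-" + i);
            t.setDaemon(true);
            t.start();
            daemons.add(t);
        }
    }

    /** Asks every helper to stop and waits until each one has exited. */
    void stopDaemons() {
        running = false;
        for (Thread t : daemons) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        daemons.clear();
    }

    int liveDaemons() {
        int alive = 0;
        for (Thread t : daemons) {
            if (t.isAlive()) {
                alive++;
            }
        }
        return alive;
    }

    /** preFork, then the fork stand-in, then postForkCommon. */
    int forkAndSpecialize(IntSupplier nativeFork) {
        int count = daemons.size();
        stopDaemons();                    // preFork: no helper threads during the fork
        int pid = nativeFork.getAsInt();  // stand-in for nativeForkAndSpecialize
        startDaemons(count);              // postForkCommon: bring the helpers back
        return pid;
    }
}
```

The point of the ordering is that the critical step only ever sees a single running thread, which is what a real fork() needs because only the calling thread survives in the child.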
ZygoteHooks#preFork
libcore/dalvik/src/main/java/dalvik/system/ZygoteHooks.java
    public static void preFork() {
        Daemons.stop();
        token = nativePreFork();
        waitUntilAllThreadsStopped();
    }

Because of the ART runtime, Zygote runs several daemon threads in addition to its main thread:
Java heap maintenance thread: HeapTaskDaemon
Reference queue processing thread: ReferenceQueueDaemon
Finalizer thread that runs finalize() methods: FinalizerDaemon
Finalizer watchdog thread: FinalizerWatchdogDaemon
These daemon threads must be stopped before the fork.
Daemons#stop

    private static final Daemon[] DAEMONS = new Daemon[] {
            HeapTaskDaemon.INSTANCE,
            ReferenceQueueDaemon.INSTANCE,
            FinalizerDaemon.INSTANCE,
            FinalizerWatchdogDaemon.INSTANCE,
    };

    public static void stop() {
        for (Daemon daemon : DAEMONS) {
            daemon.stop();
        }
    }

The array matches the four threads listed above.
nativePreFork
Through JNI this method ends up in ZygoteHooks_nativePreFork in dalvik_system_ZygoteHooks.cc:

    static jlong ZygoteHooks_nativePreFork(JNIEnv* env, jclass) {
        Runtime* runtime = Runtime::Current();
        CHECK(runtime->IsZygote()) << "runtime instance not started with -Xzygote";
        runtime->PreZygoteFork();
        // Hand the current Thread* back to Java as an opaque token.
        return reinterpret_cast<jlong>(ThreadForEnv(env));
    }

PreZygoteFork is defined in runtime.cc:
art/runtime/runtime.cc
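    void Runtime::PreZygoteFork() {
        if (GetJit() != nullptr) {
            GetJit()->PreZygoteFork();
        }
        heap_->PreZygoteFork();
    }

We will not go any deeper into the runtime here.
waitUntilAllThreadsStopped

    private static void waitUntilAllThreadsStopped() {
        File tasks = new File("/proc/self/task");
        // Each entry under /proc/self/task is one thread of the current process.
        while (tasks.list().length > 1) {
            Thread.yield();
        }
    }

This method returns only once the current process is down to a single thread.
To sum up, ZygoteHooks.preFork() stops Zygote's four daemon threads, runs the runtime's pre-fork preparation for the JIT and the GC heap, waits until Zygote is single-threaded, and stores the current thread pointer as a long in token so the thread can be reinitialized after the fork.
Zygote#nativeForkAndSpecialize
Through JNI, nativeForkAndSpecialize ends up calling the following function: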
frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
    static jint com_android_internal_os_Zygote_nativeForkAndSpecialize(
            JNIEnv* env, jclass, jint uid, jint gid, jintArray gids,
            jint runtime_flags, jobjectArray rlimits,
            jint mount_external, jstring se_info, jstring nice_name,
            jintArray managed_fds_to_close, jintArray managed_fds_to_ignore, jboolean is_child_zygote,
            jstring instruction_set, jstring app_data_dir) {
        jlong capabilities = CalculateCapabilities(env, uid, gid, gids, is_child_zygote);

        if (UNLIKELY(managed_fds_to_close == nullptr)) {
            ZygoteFailure(env, "zygote", nice_name, "Zygote received a null fds_to_close vector.");
        }

        std::vector<int> fds_to_close =
            ExtractJIntArray(env, "zygote", nice_name, managed_fds_to_close).value();
        std::vector<int> fds_to_ignore =
            ExtractJIntArray(env, "zygote", nice_name, managed_fds_to_ignore)
                .value_or(std::vector<int>());

        std::vector<int> usap_pipes = MakeUsapPipeReadFDVector();

        fds_to_close.insert(fds_to_close.end(), usap_pipes.begin(), usap_pipes.end());
        fds_to_ignore.insert(fds_to_ignore.end(), usap_pipes.begin(), usap_pipes.end());

        fds_to_close.push_back(gUsapPoolSocketFD);

        if (gUsapPoolEventFD != -1) {
            fds_to_close.push_back(gUsapPoolEventFD);
            fds_to_ignore.push_back(gUsapPoolEventFD);
        }

        pid_t pid = ForkCommon(env, false, fds_to_close, fds_to_ignore);

        if (pid == 0) {
            // Child process only.
            SpecializeCommon(env, uid, gid, gids, runtime_flags, rlimits,
                             capabilities, capabilities,
                             mount_external, se_info, nice_name, false,
                             is_child_zygote == JNI_TRUE, instruction_set, app_data_dir);
        }
        return pid;
    }

The real work happens in ForkCommon, which forks, and SpecializeCommon, which configures the child.
ForkCommon

    static pid_t ForkCommon(JNIEnv* env, bool is_system_server,
                            const std::vector<int>& fds_to_close,
                            const std::vector<int>& fds_to_ignore) {
        SetSignalHandlers();

        auto fail_fn = std::bind(ZygoteFailure, env, is_system_server ? "system_server" : "zygote",
                                 nullptr, _1);

        // Temporarily block SIGCHLD during the fork.
        BlockSignal(SIGCHLD, fail_fn);

        __android_log_close();
        stats_log_close();

        if (gOpenFdTable == nullptr) {
            gOpenFdTable = FileDescriptorTable::Create(fds_to_ignore, fail_fn);
        } else {
            gOpenFdTable->Restat(fds_to_ignore, fail_fn);
        }

        android_fdsan_error_level fdsan_error_level = android_fdsan_get_error_level();

        pid_t pid = fork();

        if (pid == 0) {
            // The child process.
            PreApplicationInit();
            DetachDescriptors(env, fds_to_close, fail_fn);
            ClearUsapTable();
            gOpenFdTable->ReopenOrDetach(fail_fn);
            android_fdsan_set_error_level(fdsan_error_level);
        } else {
            ALOGD("Forked child process %d", pid);
        }

        UnblockSignal(SIGCHLD, fail_fn);
        return pid;
    }

Before forking, the code has to deal with child-process signals and file descriptors. There are two descriptor arrays: fds_to_close holds the descriptors the child must close, and fds_to_ignore holds the descriptors that Zygote's descriptor table (gOpenFdTable) should skip. The descriptors that the table does track are reopened in the child through ReopenOrDetach, so they are not shared with Zygote.
The call that actually forks is fork(), so we focus on it.
fork
fork uses copy-on-write and is the standard way to create a process on Linux. It is called once and returns twice, with three kinds of return value:
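In the parent process, fork returns the pid of the newly created child.
In the child process, fork returns 0.
On error, fork returns a negative value.
The main work of fork is to find a free process id and then copy the parent's process information, such as the data and code segments and the code the child will run after fork().
Zygote is the parent of every Android process, including system_server and all app processes.
Zygote uses fork to create a new process. A new process A reuses Zygote's own resources, adds the resources specific to A, and together these form the new app process A.
The copy-on-write process:
Only when either the parent or the child modifies memory (the "on write" moment) does a page fault occur and new physical memory get allocated (the "copy").
How copy-on-write works:
With copy-on-write, the child's and the parent's page tables point to the same physical memory. fork copies only the parent's page table and marks the entries read-only. Parent and child share one copy of physical memory; if either side tries to modify it, a page fault is raised. Linux handles the fault by allocating a new physical page, copying the contents, and marking both pages writable, so that parent and child each end up with their own independent physical memory.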
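The idea can be modeled in a few lines of Java. This is a toy simulation under obvious simplifications: AddressSpace is a hypothetical class whose array plays the page table, a reference count replaces the read-only bit, and the write method stands in for the kernel's page-fault handler.

```java
final class CowSketch {
    /** One "physical page" plus the number of page tables that map it. */
    private static final class Page {
        final byte[] data;
        int refs;

        Page(byte[] data) {
            this.data = data;
            this.refs = 1;
        }
    }

    static final class AddressSpace {
        private final Page[] table;
        int copies; // page copies triggered by writes in this address space

        AddressSpace(int pages, int pageSize) {
            table = new Page[pages];
            for (int i = 0; i < pages; i++) {
                table[i] = new Page(new byte[pageSize]);
            }
        }

        private AddressSpace(Page[] sharedTable) {
            table = sharedTable;
        }

        /** fork(): copy only the page table, so every page becomes shared. */
        AddressSpace fork() {
            Page[] copy = table.clone();
            for (Page p : copy) {
                p.refs++;
            }
            return new AddressSpace(copy);
        }

        byte read(int page, int offset) {
            return table[page].data[offset];
        }

        /** Writing to a shared page plays the role of the page fault: copy, then write. */
        void write(int page, int offset, byte value) {
            Page p = table[page];
            if (p.refs > 1) {
                p.refs--;
                p = new Page(p.data.clone());
                table[page] = p;
                copies++;
            }
            p.data[offset] = value;
        }

        boolean sharesPageWith(AddressSpace other, int page) {
            return table[page] == other.table[page];
        }
    }
}
```

Only the page that was written gets duplicated; untouched pages stay shared, which is why forking from a preloaded Zygote is cheap.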
With that background, let's look at the code of fork:
bionic/libc/bionic/fork.cpp
    int fork() {
        __bionic_atfork_run_prepare();

        pthread_internal_t* self = __get_thread();

        int result = clone(nullptr,
                           nullptr,
                           (CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD),
                           nullptr,
                           nullptr,
                           nullptr,
                           &(self->tid));
        if (result == 0) {
            // Child: refresh the cached pid and run the child handlers.
            self->set_cached_pid(gettid());
            android_fdsan_set_error_level(ANDROID_FDSAN_ERROR_LEVEL_DISABLED);
            __bionic_atfork_run_child();
        } else {
            __bionic_atfork_run_parent();
        }
        return result;
    }

Callbacks run before and after the clone call:
__bionic_atfork_run_prepare: runs in the parent before the fork
__bionic_atfork_run_child: runs in the child after the fork
__bionic_atfork_run_parent: runs in the parent after the fork
All three are implemented in bionic/libc/bionic/pthread_atfork.cpp. Code that needs to hook into the fork can register its own handlers through pthread_atfork.
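The three hook points follow the POSIX pthread_atfork contract: prepare handlers run in reverse registration order, while parent and child handlers run in registration order. A small Java model of that registry is sketched below; AtForkSketch is a hypothetical class, and the child handlers run in the same process here purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class AtForkSketch {
    private static final class Entry {
        final Runnable prepare;
        final Runnable parent;
        final Runnable child;

        Entry(Runnable prepare, Runnable parent, Runnable child) {
            this.prepare = prepare;
            this.parent = parent;
            this.child = child;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Registers one handler triple; any of the three may be null. */
    void register(Runnable prepare, Runnable parent, Runnable child) {
        entries.add(new Entry(prepare, parent, child));
    }

    /** Before the fork: last registered runs first. */
    void runPrepare() {
        for (int i = entries.size() - 1; i >= 0; i--) {
            run(entries.get(i).prepare);
        }
    }

    /** After the fork, in the parent: registration order. */
    void runParent() {
        for (Entry e : entries) {
            run(e.parent);
        }
    }

    /** After the fork, in the child: registration order. */
    void runChild() {
        for (Entry e : entries) {
            run(e.child);
        }
    }

    private static void run(Runnable r) {
        if (r != null) {
            r.run();
        }
    }
}
```

The reversed prepare order lets handlers take locks in the opposite order from which the parent and child handlers release them.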
SpecializeCommon
Once the fork has finished, the child process exists, and the child then calls SpecializeCommon().
frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
    static void SpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray gids,
                                 jint runtime_flags, jobjectArray rlimits,
                                 jlong permitted_capabilities, jlong effective_capabilities,
                                 jint mount_external, jstring managed_se_info,
                                 jstring managed_nice_name, bool is_system_server,
                                 bool is_child_zygote, jstring managed_instruction_set,
                                 jstring managed_app_data_dir, bool is_top_app,
                                 jobjectArray pkg_data_info_list,
                                 jobjectArray allowlisted_data_info_list,
                                 bool mount_data_dirs, bool mount_storage_dirs) {
        ···
        if (!is_system_server && getuid() == 0) {
            const int rc = createProcessGroup(uid, getpid());
            if (rc == -EROFS) {
                ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
            } else if (rc != 0) {
                ALOGE("createProcessGroup(%d, %d) failed: %s", uid, 0, strerror(-rc));
            }
        }

        SetGids(env, gids, is_child_zygote, fail_fn);
        SetRLimits(env, rlimits, fail_fn);

        if (need_pre_initialize_native_bridge) {
            android::PreInitializeNativeBridge(
                app_data_dir.has_value() ? app_data_dir.value().c_str() : nullptr,
                instruction_set.value().c_str());
        }

        if (setresgid(gid, gid, gid) == -1) {
            fail_fn(CREATE_ERROR("setresgid(%d) failed: %s", gid, strerror(errno)));
        }

        SetUpSeccompFilter(uid, is_child_zygote);
        SetSchedulerPolicy(fail_fn, is_top_app);
        ···
        // Set the main thread's name.
        if (nice_name.has_value()) {
            SetThreadName(nice_name.value());
        } else if (is_system_server) {
            SetThreadName("system_server");
        }

        UnsetChldSignalHandler();

        if (is_system_server) {
            env->CallStaticVoidMethod(gZygoteClass, gCallPostForkSystemServerHooks, runtime_flags);
            if (env->ExceptionCheck()) {
                fail_fn("Error calling post fork system server hooks.");
            }

            static const char* kSystemServerLabel = "u:r:system_server:s0";
            if (selinux_android_setcon(kSystemServerLabel) != 0) {
                fail_fn(CREATE_ERROR("selinux_android_setcon(%s)", kSystemServerLabel));
            }
        }

        if (is_child_zygote) {
            initUnsolSocketToSystemServer();
        }

        // Call back into Java: Zygote.callPostForkChildHooks().
        env->CallStaticVoidMethod(gZygoteClass, gCallPostForkChildHooks, runtime_flags,
                                  is_system_server, is_child_zygote, managed_instruction_set);

        setpriority(PRIO_PROCESS, 0, PROCESS_PRIORITY_DEFAULT);

        if (env->ExceptionCheck()) {
            fail_fn("Error calling post fork hooks.");
        }
    }

The CallStaticVoidMethod call uses JNI to invoke Zygote#callPostForkChildHooks. Note that the signature shown here takes more parameters (is_top_app and the data-directory arguments) than the call in nativeForkAndSpecialize above passes, which suggests this excerpt was taken from a newer branch than the rest.
Zygote#callPostForkChildHooks
frameworks/base/core/java/com/android/internal/os/Zygote.java
    private static void callPostForkChildHooks(int runtimeFlags, boolean isSystemServer,
            boolean isZygote, String instructionSet) {
        ZygoteHooks.postForkChild(runtimeFlags, isSystemServer, isZygote, instructionSet);
    }

libcore/dalvik/src/main/java/dalvik/system/ZygoteHooks.java
    @libcore.api.CorePlatformApi
    public static void postForkChild(int runtimeFlags, boolean isSystemServer, boolean isZygote,
            String instructionSet) {
        nativePostForkChild(token, runtimeFlags, isSystemServer, isZygote, instructionSet);
        Math.setRandomSeedInternal(System.currentTimeMillis());
    }

This sets the process's random seed to the current system time.
nativePostForkChild
art/runtime/native/dalvik_system_ZygoteHooks.cc
    static void ZygoteHooks_nativePostForkChild(JNIEnv* env,
                                                jclass,
                                                jlong token,
                                                jint runtime_flags,
                                                jboolean is_system_server,
                                                jboolean is_zygote,
                                                jstring instruction_set) {
        DCHECK(!(is_system_server && is_zygote));

        // The token is the Thread* saved by nativePreFork.
        Thread* thread = reinterpret_cast<Thread*>(token);
        thread->InitAfterFork();
        runtime_flags = EnableDebugFeatures(runtime_flags);
        ···
        api_enforcement_policy = hiddenapi::EnforcementPolicyFromInt(
            (runtime_flags & HIDDEN_API_ENFORCEMENT_POLICY_MASK) >> API_ENFORCEMENT_POLICY_SHIFT);
        runtime_flags &= ~HIDDEN_API_ENFORCEMENT_POLICY_MASK;
        ···
        runtime->GetHeap()->PostForkChildAction(thread);
        if (runtime->GetJit() != nullptr) {
            if (!is_system_server) {
                runtime->GetJit()->GetCodeCache()->PostForkChildAction(
                    false, is_zygote);
            }
            runtime->GetJit()->PostForkChildAction(is_system_server, is_zygote);
        }

        bool do_hidden_api_checks = api_enforcement_policy != hiddenapi::EnforcementPolicy::kDisabled;
        DCHECK(!(is_system_server && do_hidden_api_checks))
            << "SystemServer should be forked with EnforcementPolicy::kDisable";
        DCHECK(!(is_zygote && do_hidden_api_checks))
            << "Child zygote processes should be forked with EnforcementPolicy::kDisable";
        runtime->SetHiddenApiEnforcementPolicy(api_enforcement_policy);
        runtime->SetDedupeHiddenApiWarnings(true);
        if (api_enforcement_policy != hiddenapi::EnforcementPolicy::kDisabled &&
            runtime->GetHiddenApiEventLogSampleRate() != 0) {
            std::srand(static_cast<uint32_t>(NanoTime()));
        }

        if (is_zygote) {
            return;
        }

        if (instruction_set != nullptr && !is_system_server) {
            ScopedUtfChars isa_string(env, instruction_set);
            InstructionSet isa = GetInstructionSetFromString(isa_string.c_str());
            Runtime::NativeBridgeAction action = Runtime::NativeBridgeAction::kUnload;
            if (isa != InstructionSet::kNone && isa != kRuntimeISA) {
                action = Runtime::NativeBridgeAction::kInitialize;
            }
            runtime->InitNonZygoteOrPostFork(env, is_system_server, action, isa_string.c_str());
        } else {
            runtime->InitNonZygoteOrPostFork(
                env,
                is_system_server,
                Runtime::NativeBridgeAction::kUnload,
                nullptr,
                profile_system_server);
        }
    }

Runtime#InitNonZygoteOrPostFork
art/runtime/runtime.cc
    void Runtime::InitNonZygoteOrPostFork(
            JNIEnv* env,
            bool is_system_server,
            bool is_child_zygote,
            NativeBridgeAction action,
            const char* isa,
            bool profile_system_server) {
        ···
        // Create the thread pool used by the heap.
        heap_->CreateThreadPool();
        if (!is_system_server) {
            ScopedTrace timing("CreateThreadPool");
            constexpr size_t kStackSize = 64 * KB;
            constexpr size_t kMaxRuntimeWorkers = 4u;
            const size_t num_workers =
                std::min(static_cast<size_t>(std::thread::hardware_concurrency()), kMaxRuntimeWorkers);
            MutexLock mu(Thread::Current(), *Locks::runtime_thread_pool_lock_);
            CHECK(thread_pool_ == nullptr);
            thread_pool_.reset(new ThreadPool("Runtime", num_workers, false, kStackSize));
            thread_pool_->StartWorkers(Thread::Current());
        }

        // Reset GC performance data.
        heap_->ResetGcPerformanceInfo();
        GetMetrics()->Reset();
        ···
        StartSignalCatcher();
        ···
        // Start the JDWP thread.
        GetRuntimeCallbacks()->StartDebugger();
    }

The code has been heavily trimmed; a rough understanding is enough and we will not go deeper. The parameter list shown here does not line up with the two call sites in nativePostForkChild above (is_child_zygote and profile_system_server), so this excerpt may come from a newer branch than API 29.
ZygoteHooks#postForkCommon
After the fork has completed, the daemon threads that were stopped earlier need to be restored.
libcore/dalvik/src/main/java/dalvik/system/ZygoteHooks.java
    public static void postForkCommon() {
        nativePostZygoteFork();
        Daemons.startPostZygoteFork();
    }

Runtime#PostZygoteFork
art/runtime/runtime.cc
    void Runtime::PostZygoteFork() {
        jit::Jit* jit = GetJit();
        if (jit != nullptr) {
            jit->PostZygoteFork();
            if (kIsDebugBuild && jit->GetThreadPool() != nullptr) {
                jit->GetThreadPool()->CheckPthreadPriority(jit->GetThreadPoolPthreadPriority());
            }
        }
        ResetStats(0xFFFFFFFF);
    }

Daemons#startPostZygoteFork
libcore/libart/src/main/java/java/lang/Daemons.java
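    // Daemons: restart every daemon.
    public static void startPostZygoteFork() {
        postZygoteFork = true;
        for (Daemon daemon : DAEMONS) {
            daemon.startPostZygoteFork();
        }
    }

    // Daemon: per-daemon restart.
    public synchronized void startPostZygoteFork() {
        postZygoteFork = true;
        startInternal();
    }

    public void startInternal() {
        if (thread != null) {
            throw new IllegalStateException("already running");
        }
        thread = new Thread(ThreadGroup.systemThreadGroup, this, name);
        thread.setDaemon(true);
        thread.setSystemDaemon(true);
        thread.start();
    }

Summary
Everything in this section happens inside the Zygote process. The main steps are:
Zygote receives the create-process request from SystemServer over a socket.
Zygote creates the child process through Zygote.forkAndSpecialize, which breaks down into several steps:
preFork: stops the four daemon threads in Zygote and runs the pre-fork preparation for the JIT and the GC heap.
nativeForkAndSpecialize: calls fork() to create the new process, sets the new process's gid and the main thread's tid, resets GC data, installs signal handlers, starts the JDWP thread, and so on.
postForkCommon: restarts the four stopped daemon threads and notifies the JIT.
The sequence diagram for this section is shown below.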
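App process initialization
After the fork, the child process continues in ZygoteConnection#handleChildProc.
The first article of this Framework series showed that, once the System Server process has been forked, System Server does the following:
Initializes the logging system and sets the time zone, userAgent, and so on
Starts the Binder thread pool
Runs the Java main method
The app process does much the same as SystemServer here. Both eventually call findStaticMain in RuntimeInit to run a Java main method.

    protected static Runnable findStaticMain(String className, String[] argv,
            ClassLoader classLoader) {
        ···
    }

The only difference is the className that is passed in. For an app process, className was fixed back in AMS and sent to Zygote over the socket:

    final String entryPoint = "android.app.ActivityThread";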
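Since the body of findStaticMain is elided above, here is a minimal sketch of what a reflective main lookup generally looks like in Java. It is not the RuntimeInit implementation: the error handling is simplified, and FakeEntry is a hypothetical class standing in for android.app.ActivityThread.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class StaticMainSketch {
    /** Looks up className.main(String[]) and returns a Runnable that calls it. */
    static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method main;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            main = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            throw new RuntimeException("Missing class or main method: " + className, e);
        }

        int modifiers = main.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("main is not public static in " + className);
        }

        // Defer the call so the caller decides when (and on which stack) main runs.
        return () -> {
            try {
                main.invoke(null, (Object) argv);
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        };
    }

    /** Hypothetical entry class used only to exercise the lookup. */
    public static final class FakeEntry {
        static String lastArg;

        public static void main(String[] args) {
            lastArg = args[0];
        }
    }
}
```

Returning a Runnable instead of invoking main directly matches the signature shown above, where the caller receives the Runnable and runs it later.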
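Now let's see what ActivityThread's main method does.
ActivityThread#main
An Android app is in essence a Java program, and every Java program has a main method as its entry point. The main method in ActivityThread is the entry point of the whole app.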
frameworks/base/core/java/android/app/ActivityThread.java
    public static void main(String[] args) {
        ···
        Looper.prepareMainLooper();
        ···
        ActivityThread thread = new ActivityThread();
        thread.attach(false, startSeq);

        if (sMainThreadHandler == null) {
            sMainThreadHandler = thread.getHandler();
        }
        ···
        Looper.loop();

        throw new RuntimeException("Main thread loop unexpectedly exited");
    }

The main method creates the main thread's Looper and starts the message loop.
Most readers will already know Android's Handler mechanism well. Running the message loop lets the thread process events posted by itself and by other threads, and it also keeps main from ever returning.
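Both roles of the loop can be seen in a tiny model built on a blocking queue. This is a simplified sketch, not Android's Looper: LooperSketch is a hypothetical class, tasks are plain Runnables, and unlike the real main loop it has a quit() so that the example terminates.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class LooperSketch {
    /** Sentinel task that tells the loop to stop. */
    private static final Runnable QUIT = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    /** May be called from any thread. */
    void post(Runnable task) {
        queue.add(task);
    }

    void quit() {
        queue.add(QUIT);
    }

    /** Blocks the calling thread, running tasks in order; returns how many ran. */
    int loop() {
        int handled = 0;
        while (true) {
            Runnable next;
            try {
                next = queue.take(); // sleeps while the queue is empty
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return handled;
            }
            if (next == QUIT) {
                return handled;
            }
            next.run();
            handled++;
        }
    }

    public static void main(String[] args) {
        LooperSketch looper = new LooperSketch();
        looper.post(() -> System.out.println("from the looper thread"));
        new Thread(() -> {
            looper.post(() -> System.out.println("from another thread"));
            looper.quit();
        }).start();
        looper.loop();
    }
}
```

In ActivityThread the loop is never expected to return, which is why the line after Looper.loop() throws.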
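To keep this article to a reasonable length we will not dig into the attach method. Let's wrap up.
Conclusion
Starting an app process has three parts and involves three processes.
SystemServer process: issues the create-process request through ActivityManagerInternal#startProcess (implemented by ActivityManagerService$LocalService). It first collects the new process's parameters, such as uid, gid, and niceName, and sends them to the Zygote process over a socket.
Zygote process: receives the parameters sent by SystemServer, wraps them in an arguments object (parsedArgs), and forks itself through the core method forkAndSpecialize to create the app process. The main steps are:
preFork: stops Zygote's four daemon threads so that only the main thread remains, and prepares the JIT and the GC heap
nativeForkAndSpecialize: calls Linux's fork() to create the child process; along the way it creates the heap's thread pool, resets GC performance data, sets the child's crash signal handlers, starts the JDWP thread, and so on
postForkCommon: restarts the daemon threads
App process: execution enters the app process at handleChildProc(), which opens the Binder driver, creates the Binder thread pool, sets ART runtime parameters, calls the Java main method through reflection, and starts the message loop.
Note that when AMS reaches attemptZygoteSendArgsAndGetResult in ZygoteProcess, the calling AMS thread blocks until the socket returns the new process's pid.
The simplified flow is shown in the figure below.
References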
Android API 29 Platform
《Android进阶解密》
Android四大组件与进程启动的关系
理解Android进程创建流程